Sunday, August 10, 2014

Revisit of 5 Rank Functions in Tableau (Part 1)

[Note: Part 2 is at http://vizdiff.blogspot.com/2014/08/revisit-of-5-rank-functions-in-tableau.html]

There are multiple ways to rank data. Tableau provides five different rank functions.


We can call the rank functions from 3 places:
  • Calculated Field
  • Table Calculation 
  • Right-Click* Menu
*Right-click a measure.

1.Calculated field
In creating a calculated field, we can use rank functions which are part of the table calculations. There are five of them:

2.Table calculation
Rank and Percentile are two of the table calculation types.We can access them by right or left click a measure and add table calculation.
Rank type includes four sub-types of rank functions:
  • Competition (1,2,2,4)
  • Modified Competition (1,3,3,4)
  • Unique (1,2,3,4)
  • Dense (1,2,2,3)
The competition rank is the regular rank() function.
3.Right-click menu 
Both Rank and Percentile can be called by right-click a measure from Quick Table Calculation>Rank or Percentile. Here Percentile is used for ranking.


Note that by right-clicking menu Measure>Percentile, you can pick one of the percentiles as in the picture. There are 7 ready-made percentiles (5%,10%,25%,50%,75%,90%,95%). Here percentile is used as a type of aggregation and not as ranking. It will produce a single value at each of the 7 percentiles.


Let's use the data set from Premier League 2014 to illustrate the usage of rank functions. Regardless of the final league standing of the 20 teams, Premier League allocates Facility Fees to each team depending on the number of matches broadcast by UK network. It's an indicator of the popularity of each team.

From the last column, we see that there are 7 different levels of facility fees. Liverpool is the top 1 team which got the highest fees. It's the proof that they just had a great season with a series of exciting games.
The chart shows the behavior of each rank function.

  • Rank_Unique() is the diagonal. 
  • Rank_Modified() is bottom-biased so that it tends to have larger numbers. Then it is above the diagonal. 
  • Rank() is top-biased so that it tends to have smaller numbers. Then it is under the diagonal.
  • Rank_Dense() just ranks the levels in dataset using the smallest numbers possible. Then it is the closest to zero, way below the diagonal.

Rank() and Rank_Modified() will meet at ranks where there is no duplicate. In the above graph, Spurs is the only club ranked at 6. Rank() and Rank_Modified() thus meet each other.

An interesting discovery is that Rank() and Rank_Percentile are equivalent in scale but with different notations! Will see how to prove it mathematically. I will discuss it more in the next blog.

1 comment: