Will use an example I used before: Counting the number of Premier League clubs per facility fee level. There are 7 levels of facility fees awarded to 20 clubs according to the number of matches broadcast on TV.
The main purpose here is to illustrate the method, instead of seeking the optimal one. One of the keys in this method is partitioning. It can be used in any kind of bins, especially those bins created on aggregated measures where advanced table calculation is not available.
The steps are as follows:
1.Binning the clubs
There are 7 levels of facility fees. Each level is a bin with a label. Rank_Dense() is used to create labels for the bins.
2.Sort the clubs and running-count the number of clubs per bin
Sort the clubs by facility fees and we make sure that the clubs with the same bin are next to each other. Without sort first, the running count won't work.
Then we are able to perform running count within each bin. The Running Count formula is as follows:
- IF [Bin Label] != LOOKUP([Bin Label],-1) THEN 1
- ELSE 1+PREVIOUS_VALUE(0)
- END
3.Get the highest count per bin
The formula for Count is as follows. This table calculation is set to compute using Club.
- IF [Bin Label]<>IFNULL(LOOKUP([Bin Label],1),'')
- THEN [Running Count]
- END
4.Filter the null values
For visualization, we need to filter the nulls in Count in which only the last running count is not null.
Voila here is the histogram we expected. The workbook can be downloaded from here.
View comments