1. There seem quite a few other posts regarding how to get top N and bottom N. It is a recurring topic. Here is an example:

    How to create Top 5 and Bottom 5 values with respect to a Measure in Bar graph in same worksheet

    Just figured out a simple way of dealing with this class of questions.

    Top 5 and Bottom 5

    For top 5 and bottom 5, this line of code as filter will suffice: (Use Rank() if you wish)
    • Rank_Unique(Sum([Sales]))<=5 or Rank_Unique(-Sum([Sales])<=5
    or
    • Rank_Unique(Sum([Sales]))<=5 or Rank_Unique(Sum([Sales],'asc')<=5
    We are using both descending and ascending sorts. That's it. (The first formula may not work in some Tableau desktop versions. Rank_Unique(-Sum(sales)) may not be accepted).

    Sort the result to make it look better. But sorting may not be a requirement.

    Top N and Bottom N

    To generalize a bit, for top N and bottom N, we can do this using a parameter N:
    • Rank_Unique(Sum([Sales])<=N or Rank_Unique(-Sum([Sales])<=N
    Add Color

    Color is nice to differentiate. Use this calculated field to color:
    • Rank_Unique(Sum([Sales]))<=5
    Add Rank

    Create a calc field Rank_Unique:
    • Rank_Unique(Sum([Sales]))
    Right click it and make it discrete. Drag it to first column to view the ranking.

    Voila! Click below to download the workbook.
    4

    View comments

  2. Very often people need to get the measure value on the max(Date), or the min(Date), or next to max(Date), or a date relative to the min/max date etc. (Date can be in Year, Quarter, Month, Week, Day or even second.) It is often a struggle to pick out that date and its associated measure value.

    The first thing that may come to our mind is to write a good conditional like

    Date=Max(Date) ? Or use Window_Max(Max(Date))? Then datediff()?

    I don't know how those will work. But I know something that will work and it's quite simple. No need to fight with aggregate or window functions.

    To select the value associated with min(Date), we just need to write this
    • if first() = 0 then [measure] end
    We don't even need min() function in view.

    For the value associated with max(Date), the formula is
    • if last() = 0 then [measure] end
    For the second day after min(Date), the formula is
    • first() = -1
    Whether you have guessed or not, the date that is 5 days from the max(Date) is
    • last() = 5
    Below is an example that shows you the relationship between a date range and first()/last().
    From the table, you can now figure out the formula for any date in between, as long as you are given the offset from the first or last date.

    Click the above image to download the workbook.
    0

    Add a comment

  3. Two articles have been selected as Best of the Tableau Web March 2015 :

    Grouping Lower % Bars in Bar Chart
    Nested Sorting and Top N per Category via Rank_Unique

    Excited about the selection. Very grateful to the Tableau community for inspirations and recognitions!
    1

    View comments

  4. There are two aggregates in the title. One is for creating bins. The other is for calculating distribution over bins of aggregate functions like sum, average, count etc. In case of count distribution, it is also known as histogram.

    In Tableau, there is a menu function for creating bins for any numerical dimension or measure. The bins becomes a new dimension and is mostly useful for creating histogram and the like distributions. Each bin thus created is of equal size. Bins are mutually exclusive, which means that each record of data belongs to one bin only.

    In practice, we may need to create bins from an aggregate measure and of non-uniform sizes. And then we need to compute the relevant aggregation over every bin. The purpose is to slice and dice measures from a number of bins. This is a recurring question that appears in various forms, such as this one and many more from the tableau forum:
    http://community.tableau.com/message/355082
    http://community.tableau.com/thread/160209
    http://community.tableau.com/thread/160083
    http://community.tableau.com/message/346692

    We are going to review the steps for solving the problem. There are 3 simple steps.

    1.Labeling bins
    2.Windowing aggregate computation
    3.Filtering results

    Let's apply the approach to a use case from the Tableau forum. The problem statement is: Create bins on the result of Pareto analysis in a 80:15:5 proportion. Pareto analysis is a running sum (in percentage) along a sorted dimension.

    In the example, one aggregate is used in creating bins. In this example, the aggregate measure [Cumulative Share of Sales] is used. It is computed as follows:

    - Sort [Product Name] by [Sales] in descending order
    - Calculate the running % sum of Sales

    In practice, the aggregate for bins can be anything, not necessarily the same as this one.

    Step 1. Labeling the bins

    Then we use this aggregate [Cumulative Share of Sales] to create bins [ABC Bins] and label them as follows:

    IF  [Cumulative Share of Sales] <= 0.8 THEN "A.Top 80%"
    ELSEIF  [Cumulative Share of Sales] <= 0.95 THEN "B.80%-95%"
    ELSE "C.Bottom 5%"
    END

    We get 3 bins and each has a label.

    Step 2. Windowing the aggregate computation

    For each bin, we calculate the aggregated sales within that window [Windowed_Sales]:

    Case [ABC Bins]
    When 'A.Top 80%'   then WINDOW_SUM(if [ABC Bins]='A.Top 80%'   then SUM([Sales]) END)
    When 'B.80%-95%'   then WINDOW_SUM(if [ABC Bins]='B.80%-95%'   then SUM([Sales]) END)
    When 'C.Bottom 5%' then WINDOW_SUM(if [ABC Bins]='C.Bottom 5%' then SUM([Sales]) END)
    END

    Note that Window_Sum() function is not affected by the bin labels in Case-When clause. To get the result right, we need to condition the sum within Window_Sum(). We are windowing the sum for each bin actually, and adding up rows having the same label.

    The windowed result is one per row. However we just need one per bin. Next we are going to fix that.

    Step 3. Filtering the result

    As we mentioned above, we just need one aggregation result per bin and filter the rest. Thus here is a calculated field [Binned_Sales] for that:

    IF First()=0 OR [ABC Bins]!=lookup([ABC Bins],-1)
    THEN [Windowed_Sales]
    ELSE NULL
    END

    This calculated field will give us one result per bin and make the rest of rows all nulls. The following picture shows the relationship between [ABC Bins], [Windowed_Sales], [Binned_Sales], [Sales] and [Cumulative Share of Sales].
    By dragging [Binned_Sales] to the filter shelf, setting it to compute using [Product Name] and selecting the option Special>Non-null values, we filtered out all the null rows.
    Because the rows are sorted according to [Sales] in descending order, the first row in each bin will have the result of [Windowed_Sales] and all the rest of rows are nulls. The first row of each bin is calculated by this logic:
    • First()=0 OR [ABC Bins]!=lookup([ABC Bins],-1) 
    First()=0 is the first row in the first bin. The second part of the logic is for the first row in all other bins.

    Clean up

    Move the aggregate [Cumulative Share of Sales] and [Product Name] to the detail shelf . These are the dimensions we need to keep. The only dimension we need on the Rows shelf is [ABC Bins]. And last, put [Binned_Sales] on the Label Shelf.

    Here is the workbook you can download for the above example.

    There is another use case where the problem statement is: Show sum of sales for Top 10 Customers, Top 11-20, OTHERS. The same approach can be used to solve the problem. The workbook can be found following the above link.
    3

    View comments

Blog Archive
Loading
Dynamic Views theme. Powered by Blogger. Report Abuse.