Thursday, July 19, 2018

Getting around Count Distinct in a secondary data source

Tableau won't allow the count distinct aggregation, COUNTD(), on a field from a secondary data source. For example, when we want to count the number of orders in a secondary data source where each order may span multiple records, there is no simple way to do it.

There were workarounds before LOD (level of detail) expressions became available. With LOD, this can be done a bit more easily.

Here is an example. We have a tiled map where we need to show the number of orders from each of the US states plus DC. The order data is the superstore data set included in Tableau.

The primary data set is the tiled map, which has the coordinates of each state. The superstore data is the secondary data set.

In the secondary data set, let's create a calculated field One Row per Order:
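
A sketch of the calculation, assuming the order key in the superstore data is [Order ID] and using Tableau's auto-generated [Number of Records] field (adjust both names to your own data):

```
// One Row per Order (calculated field in the secondary data source)
// Assumed field names: [Order ID], [Number of Records]
{ FIXED [Order ID] : AVG([Number of Records]) }
```

As with any LOD expression coming from a secondary data source, the linking field needs to be somewhere in the view.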

This is equivalent to de-duplicating the order records: AVG() returns 1 for each order no matter how many records the order has, so each order contributes exactly 1 to the sum. That's all we need. Sum(One Row per Order) gives us the equivalent of CountD(Order).

The example workbook is here to be downloaded for further details. In the viz, a reference count is provided using CountD() directly on the superstore data set. We can see the numbers are the same.

There is a similar solution that uses Max(1) as the aggregation, which is equivalent to AVG(Number of Records). I don't understand why its author claimed that {Fixed} won't work. {Fixed} works in the example above.

7 comments:

  1. Hi, it looks like this method does not work when you add a dimension from the secondary data source onto the filters shelf.

    1. Try to make that filter a context one. See if it works for you.

  2. I have the following: {FIXED [dis_reference_id_str]: avg([Number of Records])}. When I add the field to the view, I get the standard "Unsupported Aggregation" message.

    1. Check out some workarounds here: https://neebo.ai/analytics/data-blending-limitations-in-tableau especially "Remove all dimensions from the secondary data source and confirm that the linking field in the primary data source is in the view. Then, try using the LOD expression from the secondary data source."

      Let me know if this works out for you.

  3. Thank you!! This worked wonders!!!

  4. Hello, I also get the "unsupported aggregation" message. What did you do to solve the issue? Thanks

    1. Are you using {Fixed .... ? It won't work unless you place the linking field in the view (in the worksheet canvas) somewhere.
