Sunday, September 25, 2016

Filling Blanks via Multi-Pass Data Densification

I met an issue that requires to show a cross tab table in full glory. However the default view may not be complete because of missing members in dimensions (Domain incomplete) and missing data.

Here is how the original data look like:
- There are only 4 valid data marks (Abc), out of possible 12.
- There are in total 3 possible events in the Event dimension. Neither of the two sites has 3 events. That is domain incomplete in each site. There are possible 6 columns if complete.
- There are blank spots with no data marks. We would like to fill all the blank spots with zeroes.

We will use two passes of densification to fill all the blank spots.

1.First pass: Domain completion 
Create an Index field (=index()) and drag it to the Detail shelf. Set the Index to compute using Pane (Across then Down). This will complete the domain for Event. Note that both sites have the same events and data mark distributions.
2.Second pass: Data padding
Drag the same Index to just under the first Index. Set this 2nd Index to compute using IP dimension. So we see that this padded the last 4 of the 12 possible data marks. Now we have all the possible 12 data marks.
Now the above 2 passes have created/densified all the possible data marks in the table. Next we will fill the data marks with numbers.

3.Convert Null to Zero
If we drag Sum(Number of Records) to the Label shelf, we will see this:
Although there are 12 data marks, 8 of them are Nulls. So we don't see them. To fill the Nulls with 0, we need to create a calc field with ZN(Sum(Number of Records)). Drag it to the Label shelf. This will convert each Null to 0.
The workbook can be accessed here.

References
Zen Master Joe Mako's videos on densification are highly recommended: Part 1 and Part 2. More literatures can be found in Zen Master Jonathan Drummey's comprehensive post with a lot of references. This succinct post from the data school is also very helpful.

5 comments:

  1. Thanks for the explanation and example!

    ReplyDelete
  2. Thanks for the explanation.

    Although this would only work when IP is a dimension and not a measure, right?

    My table looks something like

    Month Name Sale
    Jan19 Alex 10
    Jan19 Alex 20
    Jan19 Bob 20
    Jan19 Bob 30
    Feb19 Bob 40
    Feb19 Bob 10

    I hope to get a table like:
    Jan19 Feb19
    Alex 30 0
    Bob 50 50

    ReplyDelete
    Replies
    1. Here is the solution:
      https://public.tableau.com/profile/a.m.5517#!/vizhome/GunjansSolution/Sheet1?publish=yes Just add ZN(Sum(Sales)) and Index()

      Delete