Data reshaping is about a single data source. By scaffolding, we can alter or transform the data structure in order to create visualizations that were not straightforward to build from the original data set.
Again, Zen Master Joe Mako has lectured about scaffolding in an hour-long video focusing on data reshaping, that is, dealing with a single data source. He included 4 examples. Here we are going to add 2 more.
Why alter the data structure? Because Tableau requires the data set to be in a certain structure before it can be rendered as charts and tables. In other words, we need the right dimensions, which in turn generate the required number of marks.
Example 1. Data Padding Via Scaffold
In this use case, manager positions are missing in some regions. The rule is to make the assistant manager the acting manager. Let's see how this can be done.
The original data set is as follows:
We can see that some of the manager positions are missing. Not all regions have managers. The desired result is as follows:
Let's create a scaffold that corresponds to the desired structure:
It has 2 dimensions and no measures. All we need to do is to fill the values in this scaffold.
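To make the shape of such a scaffold concrete, here is a minimal Python sketch outside Tableau. It assumes the two dimensions are Region and Title; the region names are hypothetical placeholders, since the article's actual values are not listed in the text.

```python
from itertools import product

# Hypothetical dimension values; the real regions are not listed in the article.
regions = ["East", "West", "North"]
titles = ["Manager", "Assistant Manager"]

# The scaffold is the full cross product of the two dimensions:
# one row per (region, title) pair and no measures.
scaffold = [{"Region": r, "Title": t} for r, t in product(regions, titles)]

for row in scaffold:
    print(row["Region"], "|", row["Title"])
```

Every region gets a row for every title, whether or not the original data has an employee in that slot. Those empty slots are what the later steps fill in.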
1. Create the scaffolding table in Excel. (Here we have a small table with only 2 dimensions. If you need more dimensions, see this article on creating multidimensional scaffolding.)
2. Import this table into Tableau as a data source. It will be used as the primary data source.
3. Blend the original data set with this scaffolding table, so that the original data set becomes the secondary data source.
4. Create a calculated field [Employee] in the scaffold that references the secondary data set.
5. Create a calculated field [Emp++] in the scaffold:
- IF ISNULL([Employee]) and ATTR( [Title] )='Manager'
- THEN WINDOW_MAX( [Employee] )
- ELSE [Employee]
- END
Thus we get the result as expected. The workbook can be downloaded here.
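The logic of [Emp++] can be sketched in plain Python. This is an illustration only, not the workbook's implementation: the employee and region names are made up, and it assumes the table calculation is partitioned by region, so that WINDOW_MAX only sees the employees of one region at a time.

```python
# Hypothetical source rows: the West region has no manager.
source = [
    {"Region": "East", "Title": "Manager", "Employee": "Alice"},
    {"Region": "East", "Title": "Assistant Manager", "Employee": "Bob"},
    {"Region": "West", "Title": "Assistant Manager", "Employee": "Carol"},
]

regions = ["East", "West"]
titles = ["Manager", "Assistant Manager"]

# "Blend": look up the source by the scaffold's two dimensions.
lookup = {(r["Region"], r["Title"]): r["Employee"] for r in source}


def pad(regions, titles, lookup):
    """Fill each missing manager slot with a name found in the same region."""
    result = []
    for region in regions:
        # Employees present in this region: the partition WINDOW_MAX would see.
        present = [lookup[(region, t)] for t in titles if (region, t) in lookup]
        for title in titles:
            employee = lookup.get((region, title))  # None mimics NULL
            if employee is None and title == "Manager" and present:
                employee = max(present)  # stand-in for WINDOW_MAX([Employee])
            result.append({"Region": region, "Title": title, "Employee": employee})
    return result


padded = pad(regions, titles, lookup)
for row in padded:
    print(row)
```

In the West region the only name in the partition is the assistant manager's, so the maximum picks that person as the acting manager. Regions that already have a manager pass through unchanged.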
Example 2. Creating a Dimension for Measure Names
It would often be useful to treat [Measure Names] as a dimension, but we can't. This is where scaffolding comes to the rescue. Here is a real-life question that Joe Mako helped solve via scaffolding.
The data set is as follows. It is a survey of 3 questions that respondents answer with yes or no. The responses were collected between 4/1/2015 and 5/23/2015.
The goal is to tally the percentage of yes and no answers per day for each survey question, like this:
In the original data set, each question is its own column holding the answer. There are 3 questions. What we need is a single [Survey] dimension that contains the 3 questions, together with an [Answer] dimension holding yes or no. So we build a scaffold like this:
Then we need to build measures like Count and Count % per question per day.
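As a rough illustration of the target shape, the following Python sketch unpivots one-column-per-question rows into (date, survey, answer) rows and then computes Count and Count %. The column names Q1 to Q3 and the sample rows are hypothetical; the real survey data is not reproduced here.

```python
from collections import Counter

# Hypothetical wide rows: one column per question (Q1..Q3 are placeholder names).
responses = [
    {"Date": "2015-04-01", "Q1": "yes", "Q2": "no", "Q3": "yes"},
    {"Date": "2015-04-01", "Q1": "no", "Q2": "no", "Q3": "yes"},
    {"Date": "2015-04-02", "Q1": "yes", "Q2": "yes", "Q3": "no"},
]
questions = ["Q1", "Q2", "Q3"]

# Unpivot: each wide row becomes one (date, survey, answer) row per question.
long_rows = [(row["Date"], q, row[q]) for row in responses for q in questions]

# Count per (date, survey, answer) and total per (date, survey).
counts = Counter(long_rows)
totals = Counter((d, q) for d, q, _ in long_rows)

# Count %: share of each answer within its question on that day.
percent = {key: counts[key] / totals[key[:2]] for key in counts}

for (d, q, a), share in sorted(percent.items()):
    print(d, q, a, counts[(d, q, a)], f"{share:.0%}")
```

With the question name available as an ordinary value, the per-question breakdown becomes a simple group-by, which is what the [Survey] dimension achieves inside Tableau.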
1. Create Date from a [Record] field with 2 records, which lets us set Start Date and End Date through 2 parameters. With Show Missing Values turned on, we get every date between the two, so we can view the data per day, per week, per month, etc.
2. Create [Answer (Fill)] and [Survey (Fill)].
These two new fields fill every mark in the table. The TOTAL() function is one way to densify the data, populating cells where there were no marks before.
- Answer (Fill) = TOTAL(MAX([Answer]))
- Survey (Fill) = TOTAL(MAX([Survey]))
3. Set them to compute using Date.
4. Thus we get our chart and table to visualize the survey result per day.
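The date densification in the steps above can be sketched in Python as follows. The start and end dates stand in for the two parameters and use the survey period given earlier; the sparse counts are made up, and filling missing days with 0 is an assumption of this sketch rather than something the workbook is confirmed to do.

```python
from datetime import date, timedelta

# Stand-ins for the Start Date and End Date parameters.
start = date(2015, 4, 1)
end = date(2015, 5, 23)

# Hypothetical sparse data: "yes" counts for one question on a few days only.
sparse = {date(2015, 4, 1): 3, date(2015, 4, 3): 1, date(2015, 5, 23): 2}


def densify(start, end, sparse, fill=0):
    """Return one entry per calendar day from start to end, inclusive."""
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    # Days with no source rows still get an entry, holding the fill value.
    return {d: sparse.get(d, fill) for d in days}


dense = densify(start, end, sparse)
print(len(dense), "days")
print(dense[date(2015, 4, 2)])  # a day with no responses in the sparse data
```

The two end points play the role of the 2 scaffold records, and the generated days in between correspond to what Show Missing Values adds in the view.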