Scaffolding is a way to blend data. In quite a few cases it does the job well enough that there is no need to join or union data at the record level.

In recent articles, I described "Lossless Data Blending via Scaffold" and "Blending Dates via Scaffold". Both use scaffolding in its simplest form: a one-dimensional scaffold that blends two data sources whose dimension members differ. The scaffolding dimension must be a superset of the same dimension in each secondary data source.

Again, I would emphasize that the difference between regular blending and scaffold-based blending is:
  • Blending: Loss of data in the secondary sources.
  • Scaffolding: No loss of data unless we want it. We can keep everything or only the data of interest. The scaffold acts as the primary, and all actual data sources are equally secondary.
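
To make the contrast concrete, here is a minimal Python sketch. It models blending as a lookup driven by the primary's dimension members, which is a simplification of what Tableau does, and the order and shipment figures are made up for illustration.

```python
# Hypothetical data: two sources keyed by date, with partly different dates.
orders = {"2014-01-01": 10, "2014-01-02": 7}
shipments = {"2014-01-02": 4, "2014-01-03": 6}

def blend(primary_keys, *sources):
    """Keep one row per primary key; look up each source, None if absent."""
    return {key: tuple(src.get(key) for src in sources) for key in primary_keys}

# Regular blending: orders is the primary, so the shipment on 2014-01-03 is lost.
regular = blend(orders.keys(), orders, shipments)

# Scaffold-based blending: the scaffold is a superset of the dates in both sources.
scaffold = sorted(set(orders) | set(shipments))
scaffolded = blend(scaffold, orders, shipments)
```

With the scaffold as primary, every date from either source gets a row, and both sources are looked up on equal terms.
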
Now, the scaffolding can help us blend the data together and show us a rather cool chart. The new question is: how do we filter it by certain dimensions?

The short answer is, we need to build the filtering dimensions into the scaffolding first. Then we can create the chart and filter the result afterwards.

This is where multi-dimensional scaffolding comes in. The detailed answer follows.

Let's take the same example as in "Taking Stock with Start and End Dates". Assume we need to filter the result by Product Category and Customer Segment.

In that example, we created a scaffold with a single date dimension. Now we need to add two more dimensions. The steps are as follows.

1. Create one column per dimension per sheet in Excel
We have three sheets with friendly names: Date, Product Category and Customer Segment. They could just as well keep the default names Sheet1, Sheet2 and Sheet3. Each sheet has a single column consisting of a header and the dimension members.
2. Cross join all the dimensions using custom SQL
A SQL one-liner suffices to generate the multi-dimensional scaffolding:

SELECT * FROM [Date$], [Product Category$], [Customer Segment$]

There are 2 elements in Date: Start date and End date. There are 3 elements in Product Category and 4 elements in Customer Segment. Cross joining them generates 2x3x4=24 combinations, and thus 24 rows in the scaffolding.
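
The same cross join can be sketched outside of Tableau to check the row count. This is an illustration in Python rather than part of the workbook, and the category and segment member names are assumptions; only the counts 2, 3 and 4 come from the example.

```python
from itertools import product

dates = ["Start date", "End date"]
categories = ["Furniture", "Office Supplies", "Technology"]  # assumed member names
segments = ["Consumer", "Corporate", "Home Office", "Small Business"]  # assumed member names

# Same effect as the comma-separated FROM clause: every combination of members.
scaffold = list(product(dates, categories, segments))
print(len(scaffold))  # 24
```
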

The size of the scaffolding equals the product of the sizes of the dimensions.

The next step is to make sure all the secondary data sources are blended with the primary on all 3 dimensions.

Finally, we create the same measure "Outstanding Orders" and drag Customer Segment and Product Category to the filter shelf. We can now filter the measure and its associated chart by the two dimensions.
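
The idea of linking on all three dimensions and then filtering can be sketched in Python as follows. All member names, figures and the `started`/`ended` sources are hypothetical, and the "Outstanding Orders" measure itself is left out; the sketch only shows that each secondary source is matched on the full (date, category, segment) key and that the filter then works on the blended rows.

```python
from itertools import product

# Hypothetical members and figures, for illustration only.
dates = ["2014-01-01", "2014-01-02"]
categories = ["Furniture", "Technology"]
segments = ["Consumer", "Corporate"]
scaffold = list(product(dates, categories, segments))

# Two secondary sources, each keyed on all three scaffold dimensions.
started = {("2014-01-01", "Furniture", "Consumer"): 5,
           ("2014-01-02", "Technology", "Corporate"): 3}
ended = {("2014-01-02", "Furniture", "Consumer"): 2}

# Blend: one row per scaffold row, linked on the full (date, category, segment) key.
blended = [
    {"date": d, "category": c, "segment": s,
     "started": started.get((d, c, s), 0),
     "ended": ended.get((d, c, s), 0)}
    for d, c, s in scaffold
]

def filter_rows(rows, category=None, segment=None):
    """Mimic the filter shelf: keep rows matching the chosen members."""
    return [r for r in rows
            if (category is None or r["category"] == category)
            and (segment is None or r["segment"] == segment)]

furniture_consumer = filter_rows(blended, "Furniture", "Consumer")
```

If a source were linked on the date only, its values could not be separated by category or segment, which is why the filtering dimensions have to be in the scaffold first.
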

The resulting interactive workbook can be downloaded here.

Dimension Reduction
We saw that the scaffolding was created from 3 dimensions. Its size, that is, the number of rows, is obtained by multiplying the sizes of the dimensions. This number can become huge if a few of the dimensions are big. Such a bulky scaffolding is often unnecessary: it takes up space and hurts performance. So we need to do some dimension reduction.

For example, in our Superstore data set (depending on the version), there are 3 product categories and 17 sub-categories. If we want to filter by these two dimensions, the above suggests that we need 3x17=51 rows of scaffolding. That assumes the two dimensions are orthogonal. In reality, they are not: each sub-category belongs to exactly one category, so the category is just a label on the sub-category. The two dimensions can therefore share one sheet, with one row per sub-category, and the size of the scaffolding is reduced from 51 to 17. If necessary, this sheet can be cross-joined with other dimensions.
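
A small Python sketch shows the reduction. The hierarchy below is a hypothetical subset with 3 categories and 6 sub-categories, not the full 17, so the counts are 18 versus 6 rather than 51 versus 17.

```python
from itertools import product

# Hypothetical hierarchy: each sub-category belongs to exactly one category.
parent = {
    "Chairs": "Furniture", "Tables": "Furniture",
    "Paper": "Office Supplies", "Binders": "Office Supplies",
    "Phones": "Technology", "Copiers": "Technology",
}
categories = sorted(set(parent.values()))

# Treating the two dimensions as orthogonal: full cross join.
orthogonal = list(product(categories, parent))

# Keeping them in one sheet: one row per sub-category, with its category as a label.
hierarchical = [(category, sub) for sub, category in parent.items()]

print(len(orthogonal), len(hierarchical))  # 18 6
```

The rows dropped by the reduction are combinations such as (Furniture, Phones) that can never occur in the data.
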

This is how multi-dimensional scaffolding works! It helps us blend multiple data sources and build dimension filters in a very flexible way, which gives us an alternative to union or join at the record level.

  1. Further tips on reducing dimension:

    1. Group hierarchical dimensions in one sheet. Time and product are two independent hierarchies, for example. So, Year, Month, Week, Day all go to one sheet. Product, Product Sub-Category, Product Category go to another sheet. These independent dimensions are also called orthogonal to each other.

    2. Pick only those dimensions that matter for the viz. No need to include all the possible dimensions.

    1. Also group two or more hierarchies into one sheet if necessary.

    2. Hi Alex! I have a small doubt. How can I find the last-day transaction data for every month of every year (I have 4 years of data) in Tableau?

  2. A use case http://community.tableau.com/thread/190903

  3. Nice work, Alexander! A clear and helpful explanation :)

    Thanks!
    Keith Helfrich | Twitter
    Red Headed Step Data

  4. Nice work, Alex! I have a small doubt: what is the usage of the Mock up function in Tableau?

    1. What is the mock-up function? Can you give an example or pointer?

    2. Sorry for the wrong entry; this is not a function. I faced this type of question in my interview. They also asked other questions, such as what mock-ups, wireframing and metrics are in Tableau.

