Counting Active Orders via Pivot
Counting Active Customers ]
Introduction
Things come and go, including yours truly. C'est la vie. Life started when I checked into Earth on birthday. When will it end? Will see. In between, I am considered as an active person in the national census.
We often need to count moving parts of a limited lifetime at a specific time. This is called stocktaking sometimes.
Stocktaking has multiple equivalent expressions:
- Churn analysis
- Take stock
- Take inventory
- Make inventory
- Cycle count
- Inventory audit
- Head count
- Faire l'inventaire
- Count concurrent users
- Count concurrent users
There maybe more and I don't know them all. It depends on the industry you work with. Stocktaking is referring to checking inventory at a specific location and at a specific time. It represents a pattern of recurring issues across various business operations.
Stock or inventory can be anything that is dynamic with a life cycle (in then out):
- goods in a warehouse
- hospital patients
- hotel guests
- hotel guests
- company headcounts
- active customers
- outstanding help desk tickets
- outstanding help desk tickets
There have been a large collection of similar problems:
1.Utilization, Queues, Census, Throughput for Blocks, Floors, Units, Rooms, Offices, etc. - What do you do?
2.FAQ: Open & Close Dates
Stocktaking is important to customer lifetime estimation, capacity planning, cost and net profit calculations in any organization who needs to deal with a dynamic flow of various business entities.
Here I am going to show a couple of solutions of visualizing such a problem.
Example: Taking Stock of Outstanding Orders
The example data set will be our familiar Superstore's where we will count the "Outstanding Orders", that is, the orders that have yet to be shipped. Between Order Date and Ship Date, an order is considered to be Outstanding.
Table Data Structure
Note that in the Superstore data set, Order Date and Ship Date are in the same row of Order ID. This is referred to as Lookup table (in comparison to Transaction table where Order Date and Ship Date are in separate rows, one row per transaction).
Each order (same ID) may contain multiple mini orders, one per Product Sub-Category. Each row in the data set represents one mini order. Each mini order has its own Ship Date.
What do we count?
So, to make it simple, we will count those outstanding mini-orders at any specific date.
Two Approaches
I would like to describe two approaches in solving the problem using Tableau. Both approaches are linked to Zen Master Joe Mako in the origin.
Table Data Structure
Note that in the Superstore data set, Order Date and Ship Date are in the same row of Order ID. This is referred to as Lookup table (in comparison to Transaction table where Order Date and Ship Date are in separate rows, one row per transaction).
Each order (same ID) may contain multiple mini orders, one per Product Sub-Category. Each row in the data set represents one mini order. Each mini order has its own Ship Date.
What do we count?
So, to make it simple, we will count those outstanding mini-orders at any specific date.
Two Approaches
I would like to describe two approaches in solving the problem using Tableau. Both approaches are linked to Zen Master Joe Mako in the origin.
Approach 1. Custom SQL
The following simple SQL will help convert a lookup table to a transaction table. The day we got an order, it is +1 in outstanding orders. The day we shipped one order, it is -1 in the count. (Check out this article on the use of custom SQL over Excel files.)
The following simple SQL will help convert a lookup table to a transaction table. The day we got an order, it is +1 in outstanding orders. The day we shipped one order, it is -1 in the count. (Check out this article on the use of custom SQL over Excel files.)
- Select [Order Date] as [Date], 1 as [Count] from [Orders$]
- Union all
- Select [Ship Date] as [Date], -1 as [Count] from [Orders$]
Approach 2. Scaffold
This approach will use a simple scaffold to build a common time frame. It doesn't require SQL to convert the current table to a transactional one. So it doesn't need to reshape the data. Instead, we will need to duplicate the data source.
Scaffold
- The scaffold table consists of a single column named [Record] with two values: 1 and 2.
- We need two parameters [Start Date] and [End Date]. [Start Date] must be earlier than the first [Order Date]. [End Date] must be later than the last [Ship Date].
- Using both parameters, we can build a [Date] dimension that covers the entire date range of interest. See the formula in picture below. (Need to turn on the "Show Missing Values" under the [Date] dimension).
Duplicate Data Source
We need to duplicate the data source, one for [Order Date] and the other for [Ship Date]. Then we will blend both dates with the one in Scaffold table.
- Rename both [Order Date] in one source and [Ship Date] in the other to [Date]. Have both blended with the one in Scaffold which is used as the primary data source.
- Create 3 calculated fields under Scaffold.
[Order Records]=ZN(SUM([Orders (Sample - Superstore Subset (Excel))].[Number of Records]))
[Ship Records]=ZN(SUM([Orders (Sample - Superstore Subset (Excel)) (copy)].[Number of Records]))
[Outstanding Orders]=[Order Records]-[Ship Records]
ZN() is very important because there is a lag between [Order Date] and [Ship Date]. One may have Null values in the eyes of the other. The subtraction in [Outstanding Orders] will go wrong without ZN().
- Set the table calculations of [Outstanding Orders] to Running Total. This gives the outstanding orders at any date.
The workbook including both approaches can be downloaded here. We are done with both approaches.
Following is some discussion on related topics.
Scaffold with Filters
If you need to filter the chart in the scaffold option, you need to include a few more dimensions in your scaffold. See this post for the approach.
Blending Data Via Multi-Dimensional Scaffolding
Close Date vs Check Out Date
A bit of nuance exists in the statement of different use cases:
[Close Date]: The last day the item is in stock.
[Check Out Date]: The first day the item is out of stock.
[Ship Date] in our example is like [Check Out Date].
Here [Check Out Date] is one day later than [Close Date]. For the calculation, we always need to know the first day when the item is no more in inventory. Given [Close Date] or equivalent, we need to do this:
This approach will use a simple scaffold to build a common time frame. It doesn't require SQL to convert the current table to a transactional one. So it doesn't need to reshape the data. Instead, we will need to duplicate the data source.
Scaffold
- The scaffold table consists of a single column named [Record] with two values: 1 and 2.
- We need two parameters [Start Date] and [End Date]. [Start Date] must be earlier than the first [Order Date]. [End Date] must be later than the last [Ship Date].
- Using both parameters, we can build a [Date] dimension that covers the entire date range of interest. See the formula in picture below. (Need to turn on the "Show Missing Values" under the [Date] dimension).
Duplicate Data Source
We need to duplicate the data source, one for [Order Date] and the other for [Ship Date]. Then we will blend both dates with the one in Scaffold table.
- Rename both [Order Date] in one source and [Ship Date] in the other to [Date]. Have both blended with the one in Scaffold which is used as the primary data source.
- Create 3 calculated fields under Scaffold.
[Order Records]=ZN(SUM([Orders (Sample - Superstore Subset (Excel))].[Number of Records]))
[Ship Records]=ZN(SUM([Orders (Sample - Superstore Subset (Excel)) (copy)].[Number of Records]))
[Outstanding Orders]=[Order Records]-[Ship Records]
ZN() is very important because there is a lag between [Order Date] and [Ship Date]. One may have Null values in the eyes of the other. The subtraction in [Outstanding Orders] will go wrong without ZN().
- Set the table calculations of [Outstanding Orders] to Running Total. This gives the outstanding orders at any date.
The workbook including both approaches can be downloaded here. We are done with both approaches.
Following is some discussion on related topics.
Scaffold with Filters
If you need to filter the chart in the scaffold option, you need to include a few more dimensions in your scaffold. See this post for the approach.
Blending Data Via Multi-Dimensional Scaffolding
Close Date vs Check Out Date
A bit of nuance exists in the statement of different use cases:
[Close Date]: The last day the item is in stock.
[Check Out Date]: The first day the item is out of stock.
[Ship Date] in our example is like [Check Out Date].
Here [Check Out Date] is one day later than [Close Date]. For the calculation, we always need to know the first day when the item is no more in inventory. Given [Close Date] or equivalent, we need to do this:
- [Date]=DateAdd('day',1,[Close Date])
- Select [Open Date] as [Date], 1 as [Count] from [Orders$]
- Union all
- Select DateAdd('day',1,[Close Date]) as [Date], -1 as [Count] from [Orders$]
Lookup Table vs Transaction Table
Keith Helfrich presented some interesting thoughts on these two structures. Data may come in either way:
- Lookup table
Both Order Date and Ship Date are on the same row of Order ID.
The discussion in this article assumes such a data structure. All the methods apply.
- Transactional Table
Order Date and Ship Date are recorded in different rows. That is, two rows per order. There is a single Date column and another column Type to denote Order or Ship.
Then we only need to create a calculated field [Count]:
- Lookup table
Both Order Date and Ship Date are on the same row of Order ID.
The discussion in this article assumes such a data structure. All the methods apply.
- Transactional Table
Order Date and Ship Date are recorded in different rows. That is, two rows per order. There is a single Date column and another column Type to denote Order or Ship.
Then we only need to create a calculated field [Count]:
- if [Type]='Order' then 1 else -1 end
Wrote another article on data blending via scaffolding http://vizdiff.blogspot.com/2015/09/data-blending-via-scaffolding.html
ReplyDeleteA use case of this article http://community.tableau.com/message/433911
ReplyDeleteanother use case
Deletehttp://community.tableau.com/message/381432
Hi. This is so arcane! I have a table with the evolving career of employees. When they were admitted, when they were promoted and when they retired. I need to do something along the lines you explained above, particularly know how many employees I had in each year. I was completely lost with the "Close Date vs Check Out Date" part. I will try the sql to see if it works.
ReplyDeleteBest,
Great article, straight to the point, scaleable, and complete with references!
ReplyDeleteThank you for publishing it!