1. Anywhere on the dashboard, I mean. This is a sequel to Creating Visual Tooltips.

    In that post, we mentioned some limitations of the method:

    - Data marks, when under a tooltip sheet, are insensitive to mouse hovering
    - Transitional lighting effect before a tooltip shows

    Here we present a solution by which those limitations can be minimized or removed. The new solution will use a method based on vertical containers, in which, a sheet will collapse if not selected.

    The main steps are as follows, assuming you have all the tooltip sheets ready.

    1. Create your primary worksheet and drag it to a dashboard canvas. It could be a map or a chart where you want to add some visual tooltips.

    2. Drag as many containers (in Floating mode) as tooltips into the dashboard canvas. Place them near the relevant data marks, or even on top of them if you wish.

    3. Drag each tooltip sheet (in Tiled mode) into one vertical container. Set the sheet to Entire View.

    4. Set up the action filter.

    5. Publish it to the server. (The tooltips are only fully functional on the server.)

    Here are a couple of examples using the technique. Click images to view or download the interactive versions.

    Tableau Ambassador 2016 V2.0
    Bar Chart with Visual Tooltips V2.0
    Creating the tooltip sheets

    1. Create a master tooltip sheet (that is going to be filtered by an action filter on a few dimensions).

    2. Duplicate it as many times as necessary for each data mark that you want to place a tooltip. (I suggest only duplicating a few first. Test them until the design is stabilized.)

    3. Create a filter for every tooltip sheet. (I create a single member set as the filter.) Place the filter in the filter shelf.
    4. Set each filter to be a context filter. This may make the tooltips show up quicker.

    In the ambassador viz, there are 13 members so that we need to create 13 tooltip sheets. In the bar chart example, we need to create 4 tooltip sheets.

    Check them out. The tooltip containers are placed randomly. Some are on top of the data marks they are associated with. But the area in the data marks are still sensitive to the mouse (Only on server though).

    The transitional lighting effect is minimized. Now we might see some very thin lines lighting up briefly. It is not as visible as in the original ambassador viz or in the original bar chart example.

    Discrepancy between Tableau Desktop and Server renditions

    It is a bit of surprise that the desktop and server render the workbook differently. I would expect some difference but not so significant.

    On desktop, a tooltip sheet on top of a data mark won't show up if I hover my mouse over the data mark.

    Instead, on server, there is no problem. A rather curious discrepancy.

    That's all. Hope that it can be useful to you guys.


    0

    Add a comment

  2. [Update: It is reported the tooltips are not working in 10.4. I tested it in 10.5 and confirmed that it is not working either. Will update if there is a workaround.

    I wrote a request for bringing the feature back to new Tableau. Please vote it up here
    https://community.tableau.com/ideas/9126
    ]

    [Joe Oppelt has figured this out and mentioned this in a forum discussion. At the time, I just didn't understand what he was talking about. I even suggested him to write a post. Much of the credit goes to Joe.]

    This is a sequel to Seamless Sheet Swapping and Sheet Swapping with Pie/Tree Map/Packed Bubble Charts with update on interactivity. Sheet swapping used to have a shift problem if multiple sheets are all crammed into a single container. It can be made seamless with no shift by laying sheets on top of each other.

    After the publication of the above post, Joe Oppelt, a fellow Tableau Ambassador, pointed out to me that, it has some issues with interactivity like tooltips. Only the tooltips on the top sheet will show if selected. The tooltips on the other sheets underneath will never show.

    Joe and Matt Lutton made a highly popular video on sheet swapping. In a video conference session, Joe gracefully showed me some wizardry using horizontal containers plus pop-up sheets to solve the problem, allowing interactivity for all the sheets. You can view and download Joe's demo workbook here. To this day, I still wonder what is the mechanism behind.

    Joe also told me that, with his design, we won't see the interactivity in Tableau desktop. It shows only after being published to the server. (What a curious discrepancy between Tableau desktop and server!)

    The discussion with Joe led to an alternative solution which seems much simpler. Still it works only on server.

    The solution is to have one container for every sheet. Then overlay the containers on top of each other. And we use vertical containers.

    Assume we have 3 sheets that are supposed to be displayed one at a time.

    Here are the major steps for creating seamless sheet swapping dashboard with tooltips:
    1.Set up the selector as a parameter which is described here
    http://kb.tableau.com/articles/knowledgebase/creating-sheet-selector-for-dashboard
    In my example, I assigned the parameter values as 1,2,3. The descriptions are Line, Bar and Circle.

    2.Create a filter and set it up as described in the above KB article.

    3.Drag 3 Vertical Containers (in Floating mode) to the Dashboard canvas. Place them anywhere without overlap.
    4.Drag one sheet into each of the containers (in Tiled mode). Make each sheet to be shown in Entire View if you wish.

    5.Set x,y,w,h parameters of each vertical container in the layout manager as follows:
    We are using an 800x600 canvas. So the containers are of the same size. Note this is a different layout from the one in previous post.

    6.Publish the workbook to the server to view the tooltips!

    Click the following image to view or download the interactive workbook.
    Why vertical containers?

    When the sheet inside the vertical container is not selected, the sheet will collapse vertically into a thin line of 13-pixel high. Since there is only one active sheet, we will see the sheet in all its glory. There is no sheet on top of others!  Hence all its tooltips are responsive.

    A non-selected sheet inside a horizontal container doesn't seem to collapse laterally. It stays open with visual transparency and blocks any sheet underneath from being interactive. That's why vertical containers are a better choice. I found out that we can make the sheet inside a horizontal container collapse by adding a blank object to the same container. It takes a few more steps than the vertical approach. An example is included in the workbook.

    In containerless sheet swapping, the non-selected sheets become transparent instead of being collapsed, thus blocking the interactivity with the mouse. Only the top sheet remains sensitive to the mouse.

    Note that the workbook includes another dashboard which has the following layout. In this way, it minimizes the collapsed space from 13-pixel to be 9-pixel high.
    That's it!
    7

    View comments

  3. Box Plot is a highly effective tool for data analysis invented by the great statistician Professor John Tukey. It gives us the following things:
    - Partition of data by quartile.
    - Visual spread of each quartile
    - Descriptive statistics: max, min, median, upper & lower quartiles.

    Quartile is a higher level of details that allows us to understand data at a summary or aggregation level. What made box plot so popular is its simplicity. Maybe 4 is the magic/optimal number of partitions that we human can grasp the most quickly.

    Box plot has been widely used in gaining insights into one dimensional data. By applying it to spatial data, we try to add one more dimension to box plot. This allows us to create a quartile-based summary view of spatial data.

    Below is an example in which we applied the technique. Click image to view or download the interactive workbook.
    The above is an example of visualizing the distribution of disease rates (cases/population) in various counties of California. The rates are partitioned into quartiles. Then we use the quartiles to color the map. This helps us gauge the distribution and gain instant insights into the data at a summary level.

    We will give details of the calculations surrounding the application of box plot next. 

    The major steps for creating the chart are:
    1.Drag County to the detail shelf.
    2.Create a percentile ranking for Rate.
    3.Create a calculated field Quartile based on the percentile.
    4.Drag Quartiles to the Color shelf.

    We are basically done here. Simple stuff.

    We can see that the red area are of high incidents or occurrences of Chlamydia including San Francisco, San Diego and Los Angeles counties.

    A few extra and optional steps can be included to help illustrate the data.

    Legends

    The spread of each quartile is an important feature for the viz. The legends is a good place to display the data range.

    The range calculation involves some table calculations for max & min of each quartile. Here is how we calculate the max for each quartile:
    The range is actually not on the legend label. Otherwise the color may change with dynamic data. So we put it in a separate table. We use bar chart to visualize the quartile spread.
    Tooltips

    We can put those descriptive stats in the tooltips if we wish. Here is how we calculate them:
    The resulting tooltips is like:
    Discrete vs Continuous Color

    The above uses discrete color for the quartiles. Since the quartiles are actually partitions of a measure, we can have the option to use continuous color scheme. Then we use this formula to designate quartiles:
    And this is how it looks with continuous color. The continuous color scheme is actually more intuitive! It shows the contrast between quartiles. Pick discrete or continuous color at your own discretion. Click image to view or download the interactive workbook.
    Box plot as reference
    You may notice that we put a box plot on the viz. It is for reference. Through action filters, we know exactly where the disease rate for each county is on the scale.

    That's all.
    4

    View comments

  4. [Update: A new method has been posted
    http://vizdiff.blogspot.com/2016/10/counting-customers-who-bought-both-and.html]

    In marketing nomenclature, this is called market basket analysis. The analysis is aimed at understanding customer purchasing behaviors and the correlations between products.

    In Tableau Knowledge Base, there is an article about the topic.
    Create Views for Market Basket Analysis

    The suggested approach is good for relative small data set. Note that self joining will generate a much larger data set. In that specific Superstore data set of ~10K rows, it generates 150K rows after self joining. That is a 15x increase. The actual multiples are in proportion to the square of the number of products.

    In our use case, we have data sets with millions of rows and a large selection of products, the approach may create a huge amount of data that will slow down Tableau a lot.

    Here we propose an approach that does the aggregations before visualization. In the example, we use the Superstore data set in Excel. In our case the data is in a database. The SQL code is similar though.

    The main steps are:

    1. Load your data through the legacy connection which enables the Custom SQL feature.

    2. Use this Custom SQL code to create data set. This will generate 289 rows given 17 Sub-Categories of products. That's all we need for visualization.
    • SELECT COUNT([a.Customer ID]) as [Count], A, B FROM 
    •     (SELECT DISTINCT [a.Customer ID],[a.Sub-Category] as A,[b.Sub-Category] as B
    •     FROM [Orders$] a INNER JOIN [Orders$] b
    •     on [a.Customer ID]=[b.Customer ID] )
    • GROUP BY A,B
    Note in the Tableau KB article, the self joining is equivalent to the code in lines 2 - 4 above. Adding two more lines of SQL code (highlight in yellow) will save us a huge amount of data load. What it does is all the aggregations per pair of product sub-categories. This can make a big difference in visualization speed.

    3. As a result, we will get a matrix like this. You can then base your viz on the data like the one in the KB article.
    All the following steps are optional. They are about to further remove the redundancy in the above matrix which is symmetric. Those numbers on the diagonal are not needed.

    4. The numbers on both sides of the diagonal are the same. We only need those data below the diagonal. Let's create a filter to remove the data on the diagonal and above it.

    First create a calc field [OrderA]: (See Coding Case Statement Made Easy)
    Create [OrderB] in the same manner.

    5. Add the [Filter] below and we will remove those redundant numbers.
    • OrderA>OrderB
    6.Here is the viz based on the data. Click image to view or download the interactive version.
    And voila. The emphasis is that we can move most of the calculations to the data preparation stage. The resulting data set for visualization is quite small. That is what Tableau is best at.

    Alternative filter

    Another way to filter the matrix (without much coding) is to create IndexA and IndexB computing respectively along A and B. For example, IndexA (Similar for IndexB):

    Then the filter is IndexA>IndexB. In the table calculation settings for the filter, IndexA and IndexB are set to compute respectively along A and along B.
    That's it!
    7

    View comments

  5. There came a question in Tableau forum a few days ago:

    How to count current (or active) customers who ordered and to whom we delivered goods within last 4 weeks? A corollary question: what are the count of active customers at a past date?

    This belongs to a general class of similar problems, which are all about counting.

    - The number of active website users who logged in within one month
    - The number of active sessions within 15 minutes
    - The number of active customers who shopped at our outlets in the past 6 months

    The list can go on. They all share the same characteristics:
    - A new active period may start before another ends.
    - The "Active" period has a fixed length following a starting date/time. After that, the status expires.

    Let's see how the number of active customers in the beginning example can be counted and visualized over time using Tableau.

    The data set consists of Customer IDs and a list of Delivery Dates for each customer. Those are the only two columns we need.
    The main steps for counting the number of active customers are:
    1. Create a calculated field [Expiration Date] by adding 28 days (4 weeks) to the [Delivery Date]

    2. Pivot the above two columns  [Delivery Date] and [Expiration Date] to become two new columns: [Date] and [Date Type]. See Counting Active Orders for details of doing this. (Go to the bottom of this post if you want Custom SQL for automation.)

    3. Create a calculated field [Value]
    4. Create a cumulative sum with a smoothing function [Running Value].
    When two delivery dates have less than 28 days in between, the cumulative sum may become 2. But we only have one active customer. The smoothing function makes sure that there is only one active customer.

    This is done at per customer level. We can visualize the periods where each customer is active. The following chart shows the contrast between RUNNING_SUM(SUM([Value])) and [Running Value]. The latter (Orange line) will always be either 1 or 0.
    5. To count the total number of the active customers, we need to use a Window_Sum() to sum them up along the [Customer ID] dimension.
    As usual, the Window function creates many rows of the same number. We just need one row. That is why we use First()=0 to keep only one. Note that we need to turn on "Showing missing values" in Day(Date) to show all the dates and make the pill to be discrete. Click image to download the workbook.
    That's it. We got a time series which shows the turnover of active customers over time.

    Regarding Data Preparation

    Given a fixed period of being active, we can derive the expiration date from the delivery date. So, we have three basic columns with both dates in the same record. Then we need to pivot the table so that we have one row with the delivery date and another with the expiration date.

    Unfortunately, the pivot function in Tableau's data source editor doesn't apply to calculated fields. To solve this, we can export the three-column table to a CSV file and re-import it to Tableau. Then we can apply the pivot to it.

    The above technique may not lend itself to automation in some cases. To fully automate it, we have to use Custom SQL as described below:
    The above SQL does both the calculation of the Expiration date and the pivot. Add more fields to the Select clause if you want to slice and dice the data afterwards. The SQL is for Microsoft's Jet SQL engine. The code may be a little different for different databases.
    6

    View comments

  6. This is a sequel to Taking Stock with Start and End Dates. We will present a new way of taking stock. It will make the calculation much easier.

    Counting active orders is equivalent to taking stock. An order is active before it is shipped. So an order is like part of the inventory before it leaves the warehouse.

    In the previous post, we presented 2 approaches to solve the problem. Here we will show yet another solution: use the native PIVOT menu in the data source editor to solve the problem (Tableau 9+ only). The advantage of the new method is there is no need of custom SQL or scaffolding anymore.

    The data source is the Superstore data in Excel. There are multiple versions of it. We use the same version as in the last post.

    The main steps are:
    1.Load the Excel data into Tableau. Select both Order Date and Ship Date columns. Click the drop down menu at the upper right corner and select Pivot.
    2.We get two new columns: Pivot field names and Pivot field values.
    Note that after the pivot, each original row is turned into two rows: one row for Order Date and another for Ship Date.

    3. Rename the Pivot field names to be Date Type, and Pivot field values to be Date. This is for better readability.
    4.Create a calculated field Value. This assigns value 1 to Order Date and -1 to Ship Date.
    The logic here is, starting from an Order Date, we have an active order. On a Ship Date, we removed an active order.

    5. Drag Value to the Row shelves and right click Sum(Value) to select Quick table calculation>Running total. The logic here is, an order will remain active until it is shipped. The running total will give us the daily count.

    6. Using Date and Value, we get the chart below which shows the number of active orders at any given date. Right click Date to turn on "Show missing values" so that we can see values for every date. Note that we hid the unused fields in the workbook.
    It is done! We can change Date to be Month or Week to see the number of active orders per Month or per Week.
    Click the image to view or download the workbook.

    Conclusion
    The Pivot feature in Tableau 9 allows us to reshape data easily, without writing Custom SQL or scaffolding.
    3

    View comments

  7. [Update] Found a new way of doing the same. It is by pivoting. Check out my video.


    [Measure Names] is sitting in the Dimensions shelf in Tableau. But it is not a true dimension, because we can't do table calculations along it for example. So we have to resort to some unusual technique to make it a dimension.
    Someone got a single row of data with multiple columns of which most are measures. Those are the result of the upstream filtering and aggregation. He doesn't have much authority in changing the upstream logic. He asked me this:

    "How to display them in bars (one measure per bar) with tooltips showing each measure's % of total over the sum of all the measures."

    The requirement is not so obvious for Tableau implementation. It requires some sort of table calculations.

    Note that the single row data is not in an Excel file. It's rather the result of a database query. So, we can't transpose or pivot it easily, because the data source editor in Tableau has got that pivot option for Excel data only.

    Here is a simple solution I gave him which met his need.

    1.Create a single column [MetaName] in Excel which lists all the measure names.
    2.Import it into Tableau and use it as the primary data source "OneColumn".
    3.Create a calculated field [MetaMeasure] like
    The above can be done by drag-n-dropping the measures from that OneRow data source to the calc field editor. Note that we have to wrap [MetaName] with ATTR(). Ignore the popup message by Tableau BTW at this step.

    Now we have a two-column table. But still, we can't calculate % of total using the quick table calculation in right-click menu. Possibly, it is because it involves a secondary data source.

    4.We have to create the formula manually. Here is the formula for calculating the % of total for each measure, computing along [MetaName]:
    • [MetaMeasure]/Window_Sum([MetaMeasure])
    Put this in the tooltips or label shelf and we will see the correct values. An example is included. Click the image below to view or download the interactive version.
    A few side notes here:
    1.Between primary and secondary data sources, we don't need any relationship in data dimensions. The reason is we are working at the aggregation level, not at the level of details.

    2.This is an extreme example with a single row of data. Actually, when there are more rows, all the aggregations should be done within the secondary data source. Then create a single column to glue the aggregated measures together.

    The technique we used here is called data reshaping or scaffolding. I have applied the same trick to create this Funnel chart:
    Towards a Perfect Funnel Chart

    More on data scaffolding
    Data Scaffolding in Tableau
    6

    View comments

  8. This post follows a few others on the same topic:
    (Click image to enable interactivity)

    Box plot has been an effective way to give us quick insights into the distribution of data. It is done by partitioning the data by quartile. Here we apply it to the visualization of the disease rate (cases/100,000 population) in California's counties.

    So in the above chart, we can see that all the counties are colored by one of the 4 quartiles. We quickly get a sense of which county belongs to which quartile.

    On the left, we have a box plot which shows the scale of the rate for each county, overlaid with a box plot.

    And we have 3 filters: Disease, Year and Sex. Depending on these 3 variables, the distribution can be very different.

    In another post, we will explain how it is done.
    0

    Add a comment

  9. This is part of my continuous experiment to better visualize spatial data. An earlier of post is here:
    It was a revelation to me when I saw the California population distribution like above. Half of the Californians are squeezed in between Los Angeles and San Diego.

    When I was working on visualizing the disease cases and ratios in California, I wanted to see the spatial distribution of the disease cases. And I used Pareto method to help create contiguous quantiles (equal portions in population) on the map, from which resulted the above picture. More examples are as follows. We can use the Pareto measure to create various partitions as we wish.
    This is a quasi-quartile partition of the populations based on county data. We see that Los Angeles county got about one quarter of the total population. No wonder it is the most populous county in the state.
    This picture shows that the north of San Francisco Bay Area is quite sparse with only 5% of the population. About a quarter of the population live between San Francisco and Los Angeles. So, we got good insights into the population distribution in California.

    The Visual Pareto Approach

    It is custom to analyze data distribution along the North-South or East-West axis. For example, the population density is more on the east and west coasts than in the inland. This is an East-West/longitudinal analysis approach because of the particular geography of the United States. For a coastal state like California, we may be interested in understanding the North-South/latitudinal distribution of the population.

    In the above examples we use the North-South approach. The county population data in California is scraped from here. The county latitude data is generated from Tableau but has to be exported to an Excel file and re-imported into Tableau, because we couldn't directly reference the latitude(generated) in our calculation. (What a waste! Thus I created a request for the feature. Seems other people already asked for it long time ago. Please vote for them if you believe it's useful.)

    Assume Pareto cumulation will be computed North-South along the latitude axis. We can create a regular Pareto chart as below.
    The Pareto chart may tell us a few things. But it's not so intuitive about the population distribution. In combination with map partition, we create something that's easy to visually interpret.

    The main steps to create the above partitioned maps are:

    1.Create a table to get the latitude of all the data points (counties here). Export it to Excel and import it again into Tableau. Then the latitude data is made referenceable.
    2.Use this data source as the primary to blend with the original data set from which we can get measures (such as populations or disease cases).

    3.Put County in the Detail shelf and sort it by AVG(Latitude) descendingly from North to South.

    4.Drag Population to the Detail shelf and set Sum(Population) to quick table calculation of Running Total. Set the table calculation to compute using County. In this step we may create a few more measures such as percent of total to show in tooltips.

    5.Create partitions to color the map. (The calculation is based on the Pareto method.) Then, drag the Quartiles (in this example) to the Color shelf.
    Voila. The workbook can be download here.

    0

    Add a comment

Blog Archive
Loading
Dynamic Views theme. Powered by Blogger. Report Abuse.