Sunday, January 3, 2021

Creating Sankey Funnel Chart via Template

A colleague once asked me how to create a Sankey Funnel Chart a while ago. I told her to start with this multi-stage Sankey template. Then color the not-in-use branches the same as the background (to hide them). It worked but the process was a little messy.

Someone asked me again a few days ago and mentioned Ken Flerlage's post that she had some problem with. I decided to take a deep dive into Ken's approach. Here I come up with an easy template for creating Sankey Funnel Chart in Tableau. The development of this template is based on the prior contributions of Jeffrey Shaffer, Olivier Catherin and Ken Flerlage.

New in this template:

  • No extra data preparation is required.
  • The polygons are created via data densification. No need to join with scaffold table. 
  • All complex table calculations are hidden from the users of this template.
  • By replacing the data source, you will have your Sankey Funnel Chart.

Three Steps to Sankey Funnel Chart

Below are the three steps to creating a Sankey Funnel Chart via the template:
  1. Add your data source to the template workbook (Download it here). Union your data source with itself. When union may have problems (Found this issue in Tableau Desktop 2020.4), try checking "Use Data Interpreter" in the data source editor to fix.

    You can also union data in Custom SQL. See here for details.

  2. Create Step1/2/3/4/5 by renaming dimensions or creating new fields (See example below).
  3. Replace the template data source by your self-unioned data source. 
If you see something like this after replacing data source:
you need to manually sort the step dimensions in question to match the branches.

The above is using a 5-step funnel as an example. To create a 4-step funnel, just remove Step5 from the template dashboard.  In case of a 3-step funnel, remove both Step4 and Step5 from the template dashboard.

Let me show you an example below.

Funnel Data Model

First, the expected data source model for funnel visualization is as follows (borrowed from Ken's post). The key is that from Step3 and later, only one category of the prior step will have non-null data in the following step. Say, in Step4, only Qualified from Step3 is available as valid data.
The aggregated data table view in Tableau is as below:
For convenience, I am going to use the Superstore data source as my data source. First, I need to adapt it to be a funnel kind of data set as above, namely create dimensions like Step1, Step2, Step3, Step4 and Step5. For the only measure, I will use Quantity.

In 5 steps, we would like to create a funnel like All Records>Region='West'>Category='Technology'>
Segment='Home Office'>Ship Mode='Standard Class'. Then we created these calculated fields.
  • Step1 = 'All Records'
  • Step2 = [Region]
  • Step3 = IF [Step2]='West' THEN [Category] END
  • Step4 = IF [Step3]='Technology' THEN [Segment] END
  • Step5 = IF [Step4]='Home Office' THEN [Ship Mode] END
Depending on the data you have, you may need to organize your data like the above. Now it's time to replace the template data source by my reshaped data source. You may need to manually sort each step dimension to be in the desired order. Here is the resulting chart:
There you go. We just showed how to create a Superstore Sankey Funnel Chart via template. Give it a try with the template. Leave comments below if you have questions.

A wide version which is similar to Ken's viz is included here. It has a parameter which allows you to set each column to align to the top, or not.
Recently I got a request that asks for a vertical version of the funnel. Here you go with the template. If you need the mirror of the below, just Show Header of the columns. Then reverse the axis.

Caveats

Some reported issues as follows. 
1. After data source replacement, you may find a lot points of exclamation in red ! 
Solution: Locate Size(1). Right click it and select Replace References. Then a list pops up. Select Size. Voila.

2.The field Path has a ! To fix it, open it with the editor. If you see something like "Table Name 1", remove the 1. Keep only "Table Name". Done.

59 comments:

  1. Read about how to union data source using Custom SQL here https://vizdiff.blogspot.com/2018/12/creating-sankey-chart-as-easy-as-1-2-3.html It's crucial that all fields match those in the template. Especially one has to create the extra field 'Table Name' to denote that upper and lower tables differently, say, as table1 and table2.

    ReplyDelete
  2. Hi- thank you for this post! i am struggling with the 'size' which is reported in excel. How does that come about? i am trying to adapt my dataset but i have a large data set and I am unclear on how to incorporate the size column in the excel sheet so i can directly connect it to tableau. Am i supposed to have the size be calculated in tableau?

    ReplyDelete
    Replies
    1. Yes. Let tableau do it for you. In your data set, you only need to have size per row/record. Tableau will do the necessary aggregation.

      Delete
    2. Hi Alex, thanks for the reply. What is the definition of size? is it manually input by me or is it a count of the records? I am not sure i understand how it fits in.

      Delete
    3. Size here is the equivalent of a measure such as Sales, Profit before aggregation. It can be number of records. But it is not Count(Order ID), i.e., not in aggregated form.

      Delete
  3. Hello and thank you for such an amazing template! I was wondering if there was a way to make this vertical rather than horizontal for a waterfall like affect. When I swap axis, it's upside-down.

    ReplyDelete
    Replies
    1. Added the vertical version of the template for you at the bottom of the post. See if it works for you.

      Delete
  4. I would like to display the percentage of the previous step next to the count. I created a LOD calculation but it doesn't seem to display. Any idea what could be causing this issue?

    ReplyDelete
  5. Thanks so much for this post Alexander - I was trying to understand data source structure but it seems I need the .CSV for wide version. Could you share it please?

    ReplyDelete
    Replies
    1. You can view or export the data source into csv from the workbook. Download it first.

      Delete
    2. Thanks Alexander- One more question - I'm stuck on connectors charts (curves) I haven't got success - I replicated but result was two bars without curve and space between them - Do you have clue what I'm missing? :(

      Delete
    3. Not sure. Too little info. Send me your workbook if you wish.

      Delete
  6. Hello, thank you for sharing this template!

    I was wondering if it is possible to recreate Wide Version with creating Steps 1-5 as Calculated Fields (as Your example without data preparation).

    Thank you in advance, Aiste.

    ReplyDelete
    Replies
    1. I believe you can with calculated fields. Give it a try and let me know how it goes.

      Delete
    2. It worked! I've created steps based on the values of previous step (e. g. if step 2 has a rejection reason A then a null value should be created in step 3).

      Delete
  7. Hi Alexander, incredible work. This really helped me to solve a problem I was stuck on forever!

    I was wondering, is it possible to add a step before the 'Lead' bar with multiple flows into the 'Lead'?

    ReplyDelete
    Replies
    1. It’s possible. Create your flow in another sheet. Then put flow sheet and funnel together on dashboard.

      Delete
  8. Hi Alexander, thank you for sharing this incredible template. It's what I desperately need. But since yesterday morning, I'm stuck at the data replacement step :(
    I had followed the listed steps with the "Superstore data source" and it worked. But when I modified my data (.csv) to the same format as the original data you used, which includes 6 column named from "Step1" to "Step 5", the last column names "Size" which is the exact value of each row.
    Firstly, I union my data source with itself. Consideraing my data has same format with original data, I skipped Step1/2/3/4/5 and I just replaced the template data source by my self-unioned data source. But lots of measure are not existed, such as "Path", "Path(bin)", "Curve 1-2","N3 Flow size".....
    How can I solve it?
    I sincerely look forward to your reply!

    ReplyDelete
    Replies
    1. You need to self union the data table using tableau’s data source editor. If not, you need to add an extra column exactly called “Table Name”. Self union is from two tables. For the upper table, name each row Table1 and for the lower table, name each row Table2. All in the new column. Then you can proceed to replace the data source. Good luck.

      Delete
    2. Yess, I union the data in tableau's data source editor. And there is a new column names"Table name", and indeed there are "data name.csv" and "data name.csv1". Could you please take a look at my own data and workbook?

      Delete
    3. This comment has been removed by the author.

      Delete
    4. Hey Alexander - Im having similar problems to Elena. Would be very grateful for an update if you were able to solve this for Elena. Im stuck on the data preparation step.

      Delete
    5. Just for record. Here is the solution to Elena's problem:
      - in Path formula, the Table Name 1 should be Table Name
      - there is an extra Size(1). Right click on it and select"Replace it by Reference". Then select Size from the list to replace it.

      Delete
    6. Hi Anonymous, check out the solution above.

      Delete
  9. Is there an easy way to add a Step 6 to the template?

    ReplyDelete
  10. Is there an easy way to add Step 6 to the template?

    ReplyDelete
    Replies
    1. Send a twbx workbook to me if you wish. I can have a look. alexandermou2000 at gmail

      Delete
    2. Awesome. I sent an email over!

      Delete
    3. Hi all, have you find ways to add step 6?

      Delete
    4. Sorry, I haven't got time to work on this. However you can try this: 1.Reduce the 5-stage funnel template to a 3-stage one. 2.Use twice the template to create two 3-stage funnels. 3.Cascade them to form a 6-stage one.

      Reference for cascading: https://vizdiff.blogspot.com/2020/01/creating-3-stage-sankey-chart-via.html

      Delete
    5. Thank you that works!

      Delete
  11. Hi Alex
    I have a query
    Will appreciate your help
    So when we are adding path(bin) in detail tab.. Data is being just half.. And in main sheet(lead) you are dividing size by 2
    Can you please help. Me. With that

    ReplyDelete
    Replies
    1. It's a bit hard to understand your issue. Could you find me in Linkedin and connect with me? https://www.linkedin.com/in/alexmou/ We may have a zoom meeting.

      Delete
  12. Ok Alex, will ping you but can you let me know one thing please
    Can we increase the size of al Orders or teal colour in height so we can see the graph is wide way as in the end, it is being too small or unviewable

    ReplyDelete
  13. Hi Alex, I have sent you connect ion request on linkdin
    Also, having issues when I am creating 6 Or 7th step further. Curve polygon calculations are giving only straight lines instead of curves
    Can you please give template
    If you have please.. Thank u in advance

    ReplyDelete
  14. Hello! Thank you for this!
    I'm hoping to use an existing Tableau Server Data source. Any thoughts on how to create a union of existing data source?

    ReplyDelete
    Replies
    1. I.recreate server extract with union.
      2.export data into a csv and go from there.
      3.use data prep to create union and put it on the server.

      Anyway, can’t use server extract directly.

      Delete
    2. I.recreate server extract with union.
      2.export data into a csv and go from there.
      3.use data prep to create union and put it on the server.

      Anyway, can’t use server extract directly.

      Delete
    3. 1.recreate server extract with union.
      2.export data into a csv and go from there.
      3.use data prep to create union and put it on the server.

      Anyway, we can’t use server extract directly.

      Delete
    4. Ok thanks. I have the original SQL and could create a union that way but then i need to maintain my original data set and the new one. I understand. Thanks for the reply.

      Delete
    5. When you create a union in SQL, make sure you add a column called "Table Name" to name the upper and lower tables in the union. This field is being used in the template. Otherwise, it won't work.

      Delete
  15. I decided to try the SQL but I have transactional timestamp data. Can you point me in the direction of turning in into the right structure?

    ReplyDelete
    Replies
    1. Group by the dimensions to aggregate data and minimize the number of rows. Then follow the direction of this post regarding the data structure. You need to make some members of a dimension to be Null. You might create a new field for this purpose. That's about it.

      Delete
    2. Excellent. This is an awesome template. Having lots of fun with it. Thanks so much!

      Delete
  16. Hi! Me again! I'm having trouble adding an additional step. Are there any special tricks to adding a step?

    I copied everything from 4-5 to create a 5-6 along with all the applicable fields, updating to the correct steps and fields. The result is a single bar, colored as False. It should be one True size 18 "bar" and 2 false size 2 "curves."

    If I try creating tab 5-6 myself, I get an error on Label 5-6 rows that Step 1 needs to be added in order for the table calc to work. Not sure why I'd need Step 1.

    ReplyDelete
  17. hi Alex, awesome work! curious, what is the purpose of performing a union on the data source? thank you!

    ReplyDelete
    Replies
    1. The branches in the chart are drawn as polygons. The purpose of the union is for creating polygons. Each polygon is drawn by connecting a number of points. We need a start and an end point. The points in between can be derived from these two. We need two rows of data for the two points. This is just a short answer o it.

      Delete
  18. Hi Thank you very much for your post! I am wondering how to add additional stage because there are more than 5 stages for my data. I tried to replicate the fields but somehow it doesnt work.

    ReplyDelete
    Replies
    1. You can remove 1 stage from the template to make it a two-stage template. Then use it to add two stages. You can even remove 2 stages from the template and make it a single stage template. Now you can add one stage at a time. Now you can add as many stages as you like by using the various templates repeatedly.

      Delete
  19. Hi Alex, thank you for your work, really helped me.
    But currently I'm stuck at adding percentage to each arms, can you give me the tricks?
    thanks before! :)

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. This comment has been removed by the author.

      Delete
    3. This comment has been removed by the author.

      Delete
    4. Here you go with an new design have percentages. https://public.tableau.com/app/profile/a.m.5517/viz/SuperstoreSankeyFunnelV2/Dashboard
      I have yet to write up something about it. This is a very modular design with much reduced number of fields which are reusable in every module.

      Delete
    5. Try this where I added percentages. https://public.tableau.com/app/profile/a.m.5517/viz/SuperstoreSankeyFunnelV2/Dashboard

      Delete
    6. Added percentage to the template. Check it out.

      Delete
    7. Added percentage to the template. Check it out.

      Delete
  20. This comment has been removed by the author.

    ReplyDelete