In data analysis, we use filters all the time. In general, we can classify them as pre-filters or post-filters to better understand their respective mechanisms.

A pre-filter works on the data set. It only keeps the data we need for the analysis.
A post-filter works on the results. It only keeps the results we need.
In between, calculations such as aggregation and ranking are performed on the data.

In SQL:
The WHERE clause defines what data to keep.
The HAVING clause defines what results to keep.

Thus WHERE is a pre-filter and HAVING is a post-filter.

WHERE, the pre-filter, works on the data set and defines which rows to keep, with conditions like STATE='California' AND CITY='San Francisco'.

HAVING, the post-filter, works on the results after the data has been aggregated or otherwise calculated. For example, HAVING Rank<=5 keeps only the top 5 results. The rest of the ranking results are discarded.
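To make the pre-filter/post-filter distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns, the sample rows, and the 1000 threshold are all made up for illustration. HAVING is shown with an aggregate (SUM), which is the case it handles in any SQL database.

```python
import sqlite3

# Hypothetical in-memory table; names and numbers are for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (state TEXT, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("California", "San Francisco", 500.0),
        ("California", "San Francisco", 700.0),
        ("California", "Los Angeles", 300.0),
        ("California", "San Diego", 1500.0),
        ("Oregon", "Portland", 2000.0),
    ],
)

query = """
SELECT city, SUM(amount) AS total
FROM sales
WHERE state = 'California'     -- pre-filter: decides which rows are kept
GROUP BY city                  -- calculation in between: aggregation
HAVING SUM(amount) >= 1000     -- post-filter: decides which results are kept
ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Portland never reaches the aggregation because WHERE removes its row first. Los Angeles is aggregated but then dropped by HAVING because its total is below the threshold.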

It is an important concept in Tableau that some filters are pre-filters (e.g. dimension filters) and some are post-filters (e.g. table calc filters). I found it interesting to apply the same concept to SQL operations as well. Here is a tweet I posted a year ago:
Pre- and post-filters are part of SQL operations. Here is the order of SQL operations:

Feel free to leave comments or contact me on Twitter @aleksoft.




  1. My understanding is that UNION happens after SELECT and FROM happens first, at the same time as JOIN but I've yet to see how or why this matters. Just using this as reference: https://vladmihalcea.com/sql-operation-order/

    1. Very good question! Union can be used to combine source data sets or result data sets.

      When multiple source data sets are involved, we do need a Select * From to get the data first. In this sense, we have to put Union after Select * From.

      Sometimes, Join/Union can be done in other tools like Excel or Python before being processed by SQL. In a pure SQL environment, it's true that Union will always be after Select statement.


(Refresh the page if you want to view the gif image multiple times. Or go to Tableau Public and click the button at the top-right corner.)

Jake and I collaborated on a dashboard. He told me that he learnt a way to create an in-place help page in Tableau. He first saw it at a conference somewhere and couldn't recall who the speaker was. So I am blogging here about it but the credit goes to somebody else. If anyone knows who the original creator is, leave a comment below.

The key idea is to float a semi-transparent worksheet on top of the dashboard, with a help text box strategically placed on top of each chart. This way, we can explain how to read each chart, which data points are important, and so on. The worksheet can be collapsed with a show/hide button.

(Addendum: Jonathan Drummey has a much better Tableau-only solution that I missed from his presentation. I only caught the later part of the presentation. You might ask him about it if you know him.)

In a recent presentation, Tableau Visionary HOF Jonathan Drummey talked about a solution for variable row heights in a text table. The question apparently came from a perfectionist Tableau designer. Tableau is not really made for text processing.

[Foreword: I asked ChatGPT o1-mini, which then wrote this. Hope it helps. All the credit and the blame go to ChatGPT.

I went over the plan and it looked decent. Whether it can be done in 30 days depends on the person and the time they spend on it. By the way, ChatGPT can be a really good study buddy. Ask it questions whenever you have any.]

This comprehensive 30-day plan is designed to take you from a Tableau beginner to an advanced user.

Mundane charts are the basic ones that any data visualization beginner can create, possibly with Show Me in Tableau. They can seem boring because many people tend to create fancier ones just to show off.

I actually like the mundane ones a lot because they are not only easy to create but also easy for stakeholders to read.

The Pareto chart is a very powerful tool, providing great insights into the data set and into the business at stake.

A while ago, Sharon came to me with a question about Pareto chart multiples. That is, there is one Pareto chart for each category, and we need to create Pareto charts for all the categories. This lets us quickly see the few most important factors that account for the majority of the output in each category.

Vilfredo Pareto (1848-1923) is the father of the 80/20 rule: 80% of output is produced by 20% of input. It has held up remarkably well through the years.

[Update: The product manager Wilson Po alerted me that the Viz Extension is still a work in progress. It will not be part of the upcoming version 2024.1. Instead, it will be released later in 2024. Please be patient.]

Tableau 2024.1 is coming. I got a chance to test drive it. As I wrote a bunch of posts on Sankey chart tutorials in the past, I am most excited by the new Sankey chart type. Here I would like to share what I learnt. This is a quick preview. Your comments are welcome.

Buzzfeed recently asked Midjourney to draw images of people in the 50 US states. So the AI drawing tool created 50 images of couples that represent its perception of the people in each state.

I just put the images into a tiled map in Tableau. Each image is added as the background of its tile.

I also added Viz-in-tooltips to enlarge each image for a closer look at the details.

Feel free to download the workbook and explore it.

The folks at Business Expert had a brilliant idea. They asked an AI how it perceives each UK bank as a dog. I was inspired to do the same for US banks.

ChatGPT was asked to confess its perception of each top US bank as a dog. Then Midjourney was tasked with generating the images. Check out which dog is matched to your favorite bank.

All are put together into a single-sheet Tableau dashboard. Feel free to check it out.

Through my previous post on the new Sankey chart type, I got in touch with Wilson, the product manager leading the development of this new chart type. I made some comments on creating a multi-level Sankey by cascading single Sankeys. He told me it can already be done by dropping more dimensions onto the Level card.

As an enthusiastic user of Sankey charts, I am excited to learn that a Sankey chart type is being piloted in Tableau Public (Web Edit only). I have written about Sankey chart design in multiple posts. A Sankey chart may appear in different forms depending on the application.

I played a little with it just to evaluate it. Here are my initial findings and comments.

1. The basic Sankey

I can quickly create a Sankey with 2 dimensions and 1 measure.