1. Zen Master Jonathan Drummey left a comment in Creating Bins Made Easy, saying that he is not using "IF ... ELSEIF..." statement when creating bins of equal size. Instead, he just creates a calculated field with some math wizardry. How brilliant! Jonathan's repertoire is infinite. Let's see how it works.

    We use the same example as in Creating Bins Made Easy where we need to create bins of equal size on percentiles. The number of bins is 25. And we want to label the bins from 1 to 25. (If your number of bins is N, just replace 25 by N in the following.)

    1.Via Ceiling()
    The formula for the calc field is
    • Ceiling(Percentile*25) 
    Ceiling() is a new function introduced in Tableau 9. However it doesn't work with table calculations. What a pity! Percentile is usually generated using Rank_Percentile() which is a table calculation. This means that the use of Ceiling() is limited.

    2.Via IIF()
    The good news is, this formula works with table calculation:
    • IIF([Percentile]*25 = INT([Percentile]*25), [Percentile]*25, INT([Percentile]*25)+1)
    This is a way to achieve the same as the ceiling function. And it works with table calculations! What it does is, if [Percentile]*25 is an integer, then the bin label (the ceiling) is the integer itself, else it is the floor INT([Percentile]*25) plus one.

    Unfortunately, the equation [Percentile]*25 = INT([Percentile]*25) is often not exact due to the low precision calculation in Rank_Percentile(). The generated percentile can be a bit off. For example, a theoretical 20% percentile could be 20.000000001% and thus may go to another bin instead of the expected one.

    3.Via IIF() Revised

    Theoretically speaking, both Ceiling() and IIF() approaches can work wonders. Due to the precision limitation in Tableau's calculation, the created bins are not as exact as expected. Precision wise, Rank/100 can do better than Rank_Percentile().

    To remedy the problem, I revised the equation as

    • INT([Percentile]*25*1000000)/1000000 = INT([Percentile]*25) 

    or equivalently

    • INT([Percentile]*25*1000000) = INT([Percentile]*25) *1000000

    The trick is to use only the more significant digits of the decimals and remove the noisy bits resulting from the calculation of Rank_Percentile(). When I format the percentile values to have 8+ decimals, the noise starts to show.

    So the ceiling formula will be revised as 

    • IIF(INT([Percentile]*25*1000000)/1000000=INT([Percentile]*25), [Percentile]*25, INT([Percentile]*25)+1)

    PS. The judicious choice of using 1000000 or 1000 as a factor depends on how tolerant you are to the computing noise. If using 1000000 which will keep the 6 most significant decimals, only something like 20.0000001% will be put in the same bucket as 20%. If using 1000,  20.0001% will be in the same bucket as 20%.

    Included is a workbook that you can download and play with.
    1

    View comments

  2. This is a sequel to Coding Case Statement Made Easy.  This topic came up when a colleague needs to create 25 uniform bins on percentiles. Quite a bit of manual work, isn't it?

    It is pretty easy if you create bins of equal size on regular measures or dimensions. You can use the right-click context menu "Create Bins" in Tableau to create them.

    It becomes tedious when working with aggregated measures (no context menu available for creating bins), especially when the number of bins is big.

    The expected bins are like those in the picture:
    There are in total 25 bins between 0 and 1. It takes quite sometime to do it by hand. The accuracy of the code is not guaranteed.

    So, Excel can do this pretty handily.

    1.Create column A for the bin ticks. Excel can do it by incrementing 0.4 per row. Just by dragging down.
    2.Create column B of bin labels from 1 to 25. By dragging down again.
    3.Create column C for the code based on A and B through the function Concatenate(), plus dragging down.
    Then copy the code to the bin creation formula with some edit.

    This applies equally to creating variable-size bins. Then you have to create columns A and B manually.

    PS.I thought everyone knew this trick in Excel. But it seems not so. Hope this helps.

    5

    View comments

  3. The title seems to imply that coding case statement is not easy. It's actually more tedious than difficult. It's mechanical, manual and repetitive. The more options in the statement, the worse it gets.

    Let's use a recent example where the expected case statement is as follows:
    Manually writing the above code is quite tedious. Moreover, manual coding those options is error prone. Using tools to help generate clean code can avoid those errors and save time in debugging.

    There are two ways by which we can make it much easier to generate the code.

    1.By Excel
    Copy the values of [Measures] using menu Analysis>View Data and paste them in Column A. And use Concatenate() function to generate Column B:
    • Concatenate("WHEN '",A1,"' THEN SUM([",A1,"])")
    2.By Tableau
    Yes, by creating a calculated field. See the picture below:
    Then we need to copy the generated code to the formula editor. To do this, go to Analysis>View Data. Copy the column that contains the code. Then paste it to the formula editor for calculated fields.

    Note that, when the dimension of interest is [Measure Names], we can't directly create a calculated field from it. Then use this process:
    1.Drag Measure Names to Rows and Measure Values to Text shelf.
    2.Go to menu Analysis>View Data to copy the column Measure Names
    3.Ctrl-V (Paste) it to a new sheet and create the calc field there.

    BTW, this applies to some of the if..elseif..elseif statement as well, as long as it is uniform in format.

    Hope this help save you a few minutes of your life!
    2

    View comments

  4. The Problem

    When dealing with large data sets in a .tde file, the responsiveness of Tableau is reduced to the minimum. The design process may become intolerable. Every bit of change or drag or drop takes long time for Tableau to calculate and to render. The viz designer (aka me) has to wait and wait.

    All I wanted is to get the logic right first. I need to move pills around freely and create a few calculated fields just for trying. I need quick response from Tableau. I need my design process to be fluent. I don't mind the data sets being small.

    Here comes a simple technique for speeding up significantly the responsiveness of Tableau Desktop, hence increasing the productivity, and most importantly, reducing frustrations.

    1.Add an extract filter (or data source filter) over date dimension 
    Most large data sets are transaction records and thus have a date dimension. Or you can select any dimension that makes sense.


    2.Select a reduced date range
    As long as the reduced data set is representative, we are ok. The relative date filter can be set to last 2 years or last 6 months etc.

    3.Extract data
    Now you will have a smaller data set to deal with. The responsiveness of your Tableau should have improved significantly. If not, go to step 2 and select a even smaller date range.

    After you are done with design, just remove the extract filter.
    3

    View comments

  5. Replacing data source in Tableau has not been a pleasant task for many of us. I wrote about it a while ago:


    To minimize the pain, you can follow these steps. Before replacing, do these:

    1.Hold ctrl key to multi-select all calc fields, groups, sets etc (Right click and select Copy)
    2.Right click Dimensions or Measures shelf in the new data source, and paste all of them in one shot.
    A few notes:
    - You can do the above copy-n-paste after replacing data source. Those "!" will disappear after. Some Red pills will turn normal as Green or Blue.
    - Parameters are independent of data sources. No need to copy them.
    - Quick filters might turn blank after replacing or color legend gets double-over-typed. The remedy is:
    1.Click X on the frame to close it.
    2.Ctrl-Z to undo the above. 
    3.The filter will miraculously appear in multi-option form. Set the filter to single value type or whatever type you wish.

    PS. Proposed an idea to Tableau to remedy the pain:


    Please vote it up.

    3

    View comments

Blog Archive
Loading
Dynamic Views theme. Powered by Blogger. Report Abuse.