Monday, March 23, 2020

FAQ Series - Order of Operations

My blog has moved to jimdehner.com 

Thank you for all the support over the past 2 years - I recently updated my blog to make it easier to read, improve the rendering of images and tie back to video presentations that are associated with the content - Check it out at  Jim's new blog  - 


So Just how does the Order of Operations affect your calculations?

Tableau, like most systems has an order or sequence in which filters are applied and calculations are executed.  The process starts at the top and proceeds to the bottom in a single pass.  Calculations executed lower in the order depend on all the steps that precede them.  









Steps 1 and 2 - Extract and Data Source Filters

First just a quick work on the first 2 filters - Extract and Data Source work as data is loaded from the source and limit the amount of data available workbook. 

The remaining filters are executed in the context of a worksheet and limit the data that is included in the table that underlays that sheet - filters on a worksheet can be applied to other sheets to limit the underlying table on those sheets also.


Step 3 - Context Filters:

Placing a dimension "In Context" will apply the filter before any calculations are executed, the Top N is determined or Sets are formed. - The data associated with the filter is not on the worksheet's underlying table and is therefor not available to remainder of the order of operation.

Just open the drop down and select Add to Context





Step 4 Sets, Fixed LOD and Top N

I'm going to use this simple table in the next few examples to show the affect of context filters



After adding Total Cost =  (Price*Quantity) and an LOD: { Fixed Type :sum(Total Cost)}

A text chart like this would result with no filters applied - and the LOD value for Fruit is 5.75






Now see the Context filter effect on the LOD - 

Start without the Produce Name in Context, Banana has been filter out but the but the LOD value remains unchanged - the LOD is calculated before the dimension filter is applied




Place Produce Name in Context 

Banana is filtered out in the Context filter before the LOD is calculated so the count of values in Fruit is now just 2 and the LOD total is reduced to 3.75 - 



Sets are affected by using a context filter also.  The set will contain the Top 3 Produce Names




With no filters applied and nothing in the Context filter,  Apple, Banana and Green Beans are the Top 3





If we filter out Banana without placing Produce Name in Context only Apple and Green Beans are in the result - 2 Produce Names - Not 3?  

The Set is determined on the entire data set - before the Dimension filter is applied - Later, after the Set is formed Banana is filtered out 

But if Produce Name is placed in Context and Banana is filter out before the set is formed 3 values are returned Apple, Green Beans and Spinach (Remember Banana is no longer in the data table for the worksheet)





Let's see what happens with a Top N Filter

Start with a filter on Produce Name set for Top N =3





Create a viz and add a filter on Type - with no filters applied and nothing in Context.

The Top 3 are Apple, Banana and Green Beans 




Filter out Vegetable from Type but do not place in Context



And only 2 values are returned Apple and Banana - the three Top N values were determined  before the Dimension filter is applied - and included 1 vegetable - Green Beans.

Later when the Dimension filter filters out  Vegetable there are only the 2 Fruit values remaining in the Top N

Bu when Vegetables are filter out in Context, the  Top N are determined only from Fruit and 3 values are returned - Apples, Bananas and Grapes - 




Step 5 - Dimension Filters

Dimension filters come in a discrete and a continuous version - The discrete dimension filter should feel familiar if you have ever used a filter on a spreadsheet. The process is the same - just select the dimension values to include in the view -  

Using the Superstore data set that came with Tableau 

Start with a simple bar chart of sales by year and segment - no filters applied 




Apply a filter on Segment and the segment is removed from the view - 



The dimension does not need to be on the rows or the columns in the view - the data will will be filtered from the data table and the view adjusted




Dimension filters can also be applied to continuous data - Here a range filter with start and stop dates to a date field  


Or as a relative filter - like the last 6 Quarter shown below



Dimension filters are applied to the underlying table for the worksheet  after the context filter and set, top N and Fixed  LOD calculation noted earlier.   After they have been applied the table has been set and further calculations or filters are only applied to values in the table .

Step 6 - Data Blending 

The topic is much too broad for a full discussion here but note where Blending occurs - After Fixed LOD's and dimension filters have been applied. Blending results in data that is aggregated at the level of the link (relationship) among the data sets - but further dimension filtering or use of LOD's across the blended data sets is not available.


Step 7 - Include and Exclude LOD's

Include and Exclude are the less frequently used pair of LOD expressions -

Based on their position in the Order of Operation they operate on the data table that results after the application of all Dimensional and context filters -

So if the table is what what do they do

Include and Exclude refer to the visible portion of the table that is in the Viz - Include will use dimension in the table NOT visible in the viz into the calculation  - Exclude will not use dimensions visible in the view -- 

Lets see how they work:

For these examples, we will use a small product sales data set - 2 Customers, 2 Product Lines, 4 Products and a variety of colors




Add an Include LOD at the Product level - 



When the viz includes the Product level and lower the average from the LOD and a simple average on quantity return the same result



As the hierarchy is compressed the value from the LOD is fixed at the product level



and  return the average at the product level even when it is no longer in the view




Exclude will eliminate the dimension from that calculation and the expression will be based on the next higher level in the view

With the same data set and




will return this when the detail is below the level of Product  - the same a simply summing the quantity




but as the hierarchy is compressed the values is locked at the Product level




and remains at that level 



Step 8 - Measure Filters

Measure filters limit the view based on the values in the data table (not the categorization created by the Dimensions) and can be applied to the aggregate level or the row level data.

When you drag the measure to the rows shelf a window will open giving you to filter at the Row Level "All Values" or at any of the noted levels of aggregation - 


Continuing with the previous data set and filtering for the Sum(Quantity)  - a continuous range filter will open - 



Change the filter and any total value form the view that is not in the filter range is filtered out




when the hierarchy is collapsed to Product Line the only 2 totals that in the range are 14 and 19 - the 94 and 62 are filtered out of the viz




Now, look at how the Row Level Measure filter produces a different result

The filter in NOT an aggregate (no SUM() around the measure)
But for this view the result is the same as the aggregate example



Collapse the hierarchy as before the results are different - and 2 totals 37 and 55 are not filtered out -  Why?  - 

The filter is applied at the row level in the data table and all the values between 10 - 35 are included - Then when the measure is brought to the viz they are summed - 




Step 9 - Grand Totals 

Much like Blending Data - Grand totals are a large topic and will be addressed in a later posts - But note that Grand Totals are before Table Calculations - 

Step 10 - Table Calculations

As the name implies, Table Calculations are applied to the underlying table for the individual worksheet.  At the bottom of the order of operations, they work with the fully filtered data and are always aggregates.  They also can't be used in calculations that precede them in the order of operation - e.g. Table Calculations, Sets , or Top N - 

They are extremely powerful tools used to compare data, total, look up or provide running sums across or within the data table -understand their position in the order of operation is at the bottom and the data table has been set. Using Table Calculations will be the subject on another FAQ so come back.



Hope this helps provide a better sense of what the order of operation is and how it is important when creating your viz

A workbook containing the examples used here can be found and downloaded from 
Link to worbook

Presented at Pittsburgh TUG 12/17/20 see Video link  approx 45 minute mark


Enjoy and let me know if you have questions
Jim

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.