Monday, January 27, 2020

FAQ Series - Not all NULLS are created equal


Not all Nulls are created equal? - Really?

Nulls - they break your calculations and mess with your charts  but why are they difficult to fix?

There are 2  types of nulls each has a different solution

Empty Cell Null

The easiest to understand and to fix is the Empty Cell Null  in your data set.  In this simple data set it is easy to see there is an empty cell (a null) for Cats in the South




When the spreadsheet is loaded into Tableau and viewed on the Data Source tab it is still easy to see there is an empty cell



There is a record for the combination of South and Cats but there is no value in the record. Tableau will interpret it as a Null value -  





The empty cell null value is easy to fix with the ZN() function  (ZN for Zero Null)





No Record Null

What happens when the data is missing the entire record -  The preferred data structure for Tableau would be a tall and narrow set  like this



Now it is not as easy to see there is no data for South Cats  - This is an example of the second type of Null - the No Record Null - and it can't be fixed with using ZN()




There is no record South/Cats so there is no null value in a record to "zero out"

You can force a 0 in the viz using a lookup function  (NOTE this does NOT create a South /Cats record in the data set)


ifnull(lookup(sum(Quantity),0),0)



You are able to place a value into a view but other functionality is limited.  Lookup is a table calculation performed at the bottom (last) in the Order of Operation.  Table calculations areexecuted after all filters have been applied and all other calculations have been performed - The work on the underlaying table for the worksheet and are only performed at the aggregate level 

Lets see what that means in practice:

Running total is another table calculation and it can be applied to the Fill Null calculation and note that the Grant Total also returns the correct result




likewise Rank_Unique will return the correct Rank for a Fill Null 0











Most users want to total the data using Grand Totals - seems to return the correct value



But see what happens when we try to get an average the results do not meet expectations - averaging the quantity returns 80/3 for 26.7 not the 20 (80/4)  because there are only 3 marks (real records) in the viz-  The total on Fill Null can only be set to Automatic and returns the sum of the records





Using the Average in the "Fill Null" calculation will result in a total that is 26.7 (80/3) again









and the Lookup can not be used LOD expressions




and it can't create a record in the data table - Here the table doesn't contain South and Null Fill has no place to put a 0





So using a lookup formula to fill a No Record Null has its limit - What can you do

Padding or Scaffolding

You will need to create a record in the data (think like a cell in excel) that can be addressed with combination of dimensions - here South/Cats - to complete the data set - the technique is referred to a "Padding" or "Scaffolding"

It's not difficult but takes a couple steps - Make a separate file that list every combination of the dimensions you will need in the analysis - 




Upload the file to Tableau and Left Join it with your data - the missing records are added and the measure value is entered as a "Empty  Cell Null"  - 




The table is now complete and empty cell nulls can be used in any formula using the ZN() function 



ZN fills the cell with a 0 and the value can be used in any type of calculation - here a simple total using average - 




Obviously an elementary examples that illustrate that all nulls are not created equal

The next time you are face with a Null problem - think about which type you have 

If the data table is complete but there is an empty cell  wrap the calculation in ZN() 

If the data table is missing records and
 you need to place a 0 (or other value) in the chart you can use the Lookup table calculation    Ifnull(lookup(sum(measure),0),0)

But if the data set is missing records and you need to use a "No Record Null" cell value in other calculations then Padding or Scaffold is the choice
I hope it helped you better appreciate Nulls


Enjoy

Jim   

2 comments:

  1. A copy of dimension in an incomplete data set, can pad your data to do let you use the null format. It's a bit tricky though.


    ReplyDelete
  2. Yes it can - it is and yes it is tricky and maybe a bit advanced for the for the intended user of the blog - but thank you for the comment

    ReplyDelete

FAQ Series - Fiscal Date

Many businesses use a fiscal calendar for accounting and reporting . Their fiscal year can start on the first day of any month and may not a...