FORMATTING RAW WELLNESS DATA BASED ON ROLLING 28 DAY Z-SCORES

This latest download is something I have been working on for quite some time, having built some similar functions for projects before, but without a 28 day rolling average, standard deviation and z-score.

Using Z scores is an effective way to monitor significant deviations away from baseline values for any metric you are measuring.

This spreadsheet demonstrates how to calculate an individual rolling 28 day mean, standard deviation and subsequently, Z scores from a bank of wellness raw data.

Screenshot (109).png

The spreadsheet also uses conditional formatting to highlight raw wellness scores for a given day, based on the rolling z score for each athlete and wellness variable. This is a useful trick as raw scores are not very useful in isolation, but they may have deviated significantly below baseline, which the z-score will help highlight.

I’ve set up some mock wellness raw data for 18 athletes, with a date range that runs up to 2018, for those who may only come across the spreadsheet until then!

Screenshot (72)

I’ve set up a pivot table in the format below in a separate tab based off this raw data. As you can see it is quite widespread, with the dates going across the columns.

By clicking on the filter button just above first date in the table, we can narrow the dates into a more manageable data set, and just display today’s date.

Screenshot (103).png

Screenshot (102)

Screenshot (98).png

The next step is to set up some extra feeder columns in the wellness raw data tab, with an equation that only displays data for the last 28 days.

To do this we will first create a True or False column. This will tell excel whether the date of all the wellness data, falls within the last 28 days before the date that is displayed in the pivot table (today’s date).

The formula uses the AND function, and only displays the value if all conditions for TRUE are met:

=AND([@Date]<‘Daily Wellness’!$B$4,[@Date]>=’Daily Wellness’!$B$4-28)

This formula tells excel to display True or False, if the date in the column B2 is within 28 days before the date at the top of the pivot table in the Daily Wellness tab.

Screenshot (105)

This is a useful function, as if you wish to go back to view a certain day using the pivot table, then the equations will follow suit, and display a TRUE for the previous 28 days from this new date.

It also means that the z-score calculations will coincide with this date change, so you don’t end up with a raw wellness score and formatting, based off calculations for a different date range.

In this file, I have named the date column Date, hence when I click on cell B2 when creating this formula, the [@Date] appears in the formula box instead of B2. Since the value in B2 below is the 18/04/17, that falls outside 28 days before today’s date so it returns a FALSE.

Screenshot (95)

Now we can look at an array formula that will display the wellness variables for each athlete, provided there is a TRUE value in the 28 Day column. The formula is as follows:

=IF([@[28 Day]]=TRUE, INDEX(Data, MATCH($B2&$C2, [Date]&[Athlete Name],0),MATCH(Wellness[[#Headers],[Mood ]],Wellness[[#Headers],[Day]:[Muscle Readiness]],0)), “”)

This formula tells excel that if the 28 day column contains TRUE, then it should look in the series called “Data”, a name I have assigned to the series of cells from A2:H5185.

In this series of cells, I want excel to find the values for B2 and C2 (the date and athlete name), look for them in the entire columns named Date and Athlete Name, and then match them with the corresponding value in the Mood Column.

The Mood heading must be searched for in the Wellness headers that span from Day to Muscle Readiness, to get the correct match.

Finally, the two “” at the end of the formula, tells excel to leave the cell blank if there is no TRUE in the 28 Day column. The formulas we use to calculate rolling 28 day averages do not recognize blank cells, so this part is key.

Screenshot (96)

Below shows how the equation displays the wellness data when there is a TRUE in the 28 Day column, and how all the other cells corresponding with FALSE are blank.

Screenshot (97)

Once this is set up we can start to plug in the average and standard deviation formulas, to help form the rolling z-scores.

To calculate the average, we will use an averageifs formula to lookup the athlete name in column A, match it with the names in the athlete name column in the wellness raw data tab, and get an average of all visible data in the 28Day Mood column for this athlete.

This is the value of having the “” at the end of the last equation, because even though we select the entire Mood range, the averageifs formula will not include blank cells in its calculations. Thus, we have a 28 day rolling average calculation:

=AVERAGEIFS(Wellness[28DayMood], Wellness[Athlete Name], A3)

Simply drag this formula down and do the same for each wellness variable by changing 28DayMood to the 28DaySleep Quality column etc. This will calculate the average for this variable for each athlete. Then we can move onto rolling standard deviations.

Screenshot (79)

Screenshot (80)

The standard deviation formula works the same way the averageifs formula does, but is slightly different in structure.

We will look up the athlete name in column A, match it with the names in the athlete name column in the wellness raw data, and run a standard deviation on all visible data in the 28Day Mood column.

{=STDEV.S(IF(Wellness[Athlete Name]= Calculations!$A3,Wellness[28DayMood], “”))}

Since this is an array formula, as indicated by the curly brackets at the end, you need to hit ctrl+shift+enter to complete this formula. Again, the formula will not include blank cells in its calculations.

Once we have both the average and standard deviation for each athlete and variable, we can calculate the z-scores.

Excel has a useful built in formula to calculate z scores, the STANDARDIZE function:

STANDARDIZE(x, mean, standard_dev)

All you need for the standardize function is:

X – (the raw wellness score for today)

Mean – (average of the last 28 days for wellness variable)

Standard_dev – (standard deviation of the last 28 days for wellness variable)

Since we have already calculated the average and standard deviation for each athlete and variable, all we need is the raw wellness score for today to complete the calculation.

Assuming the order of our athlete names in the pivot table is the same as the layout for the average and standard deviation calculations, then we just pick a cell within the pivot table, where each wellness variable is located, to act as the X in the STANDARDIZE formula.

So, the STANDARDIZE formula will look as follows for athlete 01 and mood:

=STANDARDIZE(‘Daily Wellness’!B6, Calculations!B3, Calculations!G3)

Screenshot (107).png

Continue with the same formula for all variables and athletes to get a full z-score profile. Then, apply some conditional formatting to make things clear and simple (this will also help when comparing formatting between raw scores and z-scores later!

Screenshot (106).png

I have formatted to the z scores based on three values:

Orange –  Cell value between -0.00 and -1.49

Red – Cell value below -1.5

Green – Cell value above 0.5

Screenshot (88)

The final part of this sheet will be to conditionally format the raw data scores, based off the z-scores in the other tab.

Screenshot (86)

Conditionally formatting a cell based off another cell, just requires using a formula to dictate how to format the cell of your choice.

For all cells containing a z-score below -1.5, I want the cell to fill red, so for Athlete 01 and Mood score, it’s as simple as:

=CalculationsL3<-1.5

L3 being the cell in the calculations tab, with the Mood Z Score for Athlete 01.

By using the “apply rule to” section of the edit formatting rule window, we can select cells B6 to F23, which will format all scores in each column for each athlete, based off the same formula. This avoids reapplying the same formula for each athlete and variable!

Just make sure L3 doesn’t look like this – $L$3, this means this cell is locked and the formula will only use the value in L3 instead of moving the value to L4 for athlete 02 etc.

Excel will automatically put the $ for each cell, so just delete these signs to make sure the formula just says L3, as you will see in the picture below.

Screenshot (90)

Next, we need to format the cells to turn orange when the z-score cell value is between -0.00 and -1.49.

Screenshot (89)

Conditionally formatting another cell for values in between -0.0 and -1.49 requires a small formula. We will use the AND function, to create a formula with two parameters, a value less than -0.00 and greater than -1.49 means anything in between these two numbers will flag as orange.

=AND(Calculations!L3<=-0.00,Calculations!L3>=-1.49)

Again, we need to make sure L3 doesn’t look like $L$3, so we can apply the formula to the full range of raw scores again.

Lastly, I just need a simple formula to fill all cells with the colour green, when the value is above 0.5, so it’s as simple as:

=CalculationsL3>0.5

Screenshot (91)

I like to copy and paste the conditionally formatting z-scores, beside the pivot table as a quick reference to make sure everything is in order, especially when I come back to the file a few days later to see if it updates.

Screenshot (104).png

As you can see the pivot table is updated and the z-scores match in terms of formatting, which is exactly the way I want it! I can cut and re-paste the z-scores back into the calculation column and leave myself with just the pivot table as my dashboard once again.

Screenshot (93)

Hopefully this helps give some context to raw data in excel by using z -scores to determine the significance of a raw data change, and save some time when analysing wellness data!

Click here to be taken to the free downloads page to get access to this spreadsheet

Enjoy!