July 2008

You are currently browsing the monthly archive for July 2008.

Peter explained an useful array based formula in his post. I won’t repeat what he was trying to accomplish, but a piece of his post really got me thinking. So for my very first, real post, I will blog about a real issue that many of us will face at one point.

Here’s the scenario. You are the valedictorian from the USA Today’s #1 ranked Liberal Arts University in Northern Kentucky. Your summa cum laude in Art History has taken you to the most prestigious consulting firm, ABC Consultants LLC (“ABC”). Your manager, Peter, is very excited to have you on board and tells you about a really exciting project that he worked on last year. It deals with the super-innovative and enterprising firm that is breaking new grounds everywhere it turns. This firm is known as XYZ Corporation (“XYZ”). Peter is really proud of the work he did last year (after all, XYZ renewed its contract with ABC, so why not be proud Peter?) and directs you to a spreadsheet named “XYZ Model.xls“. He wants you to take a look at the model and understand it by tomorrow morning so you can have a long conversation with XYZ’s CEO about its methodologies. Before you leave, Peter mentions:

“XYZ Model.xls may have a ‘convoluted approach and a super long formula‘”.

Here’s that formula:

{=SUM(C4:C12*((C4:C12-D4:D12)/IF(C4:C12=D4:D12,1,
ABS(C4:C12-D4:D12))-1)/(-2)+D4:D12*((C4:C12-D4:D12)/
IF(C4:C12=D4:D12,1,ABS(C4:C12-D4:D12))+1)/2)}

Array formulas are great because you can cram a ton of information into one single cell. The problem is, they are often difficult to understand because they can be very involved! So where would you begin?

1) Organize the array formula. When programmers code, they love to use indenting to help organize their thoughts. Let’s use a similar approach:

{
_____=SUM
_____ (
__________C4:C12 *
__________(
_______________(C4:C12-D4:D12) /
_______________IF(C4:C12=D4:D12,1,ABS(C4:C12-D4:D12))
_______________-1
__________) /
__________(-2)

__________+

__________D4:D12 *
__________(
_______________(C4:C12-D4:D12) /
_______________IF(C4:C12=D4:D12,1,ABS(C4:C12-D4:D12))
_______________+1
__________ ) /
__________2
_____)
}

2) Look for patterns. Many Excel array formulas process similar cells of information. Let’s define variable lists A through E as the following:

A = C4:C12
B = D4:D12
C = C4:C12-D4:D12
D = IF(C4:C12=D4:D12,1,ABS(C4:C12-D4:D12))
Let’s also have … E = C ÷ D
_____Note (if we want to go wild)*
_____C = A – B, D = IF( A=B,1,ABS(C) )

3) Now that we have organized our formula and defined variables, we can combine the results from Steps 1 & 2 together to create a cleaner formula.

= ?(A x (E – 1) ÷ -2 + B x (E + 1) ÷ 2)
= ½ x ?(-A x (E – 1) + B x (E + 1)) | E = C ÷ D

4) The formula seen in Step #3 looks complex, but to me it looks a lot simpler (and less confusing) then the original formula!  Let’s take advantage of Excel and expand out the array formula.

Download the Sample File

5) Great.  Now that we have confirmed our result with the original array formula’s result, we are done, right?  No way!  Remember Peter’s initial request.  He would like you to be able to  hold a conversation about the model.  Let’s understand the formula!

= ½ x ?(-A x (E – 1) + B x (E + 1)) | E = C ÷ D

  • C is simply the difference between A and B.
  • D is the absolute value of C or 1 if C is zero.  Why is D not just zero when C is zero?  I will tell you soon.
  • E as mentioned is C ÷ D.  This results in E becoming either -1, 0, 1.  Back to our non-zero D riddle above.  Since E is C ÷ D, dividing by D = 0 would result in a #Div/0! (Divide by Zero) error.  Mathematically speaking, anything divided by zero gets everybody confused!
  • Now we know that E is -1, 0 or 1.  Since we have calculated E, lets forget about arrays C and D.
  • Let’s take a look at each scenario.
    • If A < B Then E = -1
      • = ½ x (-A x -2 + B x 0)
      • = ½ x (-A x -2)
      • = A
    • If A = B Then E =0

      • = ½ x (-A x -1 + B x 1)
      • = ½ x (A + B) |  A = B
      • = A or B (it does not matter!)
    • If B < A Then E = 1

      • = ½ x (A x 0 + B x 2)
      • = ½ x (B x 2)
      • = B
  • Based on the results above.  Peter has duplicated a Min like function for A and B.  This formula does achieve what Peter was originally set to accomplish “Summing Element-Wise Minimums“.

6) If you become completely confused by the formula, removing yourself from the spreadsheet will often clear your mind.  Do your best to understand what you can, but do not spin your wheels!  Don’t be shy.  Have a conversation with your manager (or a knowledgable colleague) to either confirm or gain further understanding.  As you gain practice and experience with array formulas, you will begin to understand these formulas quickly.  Just be sure when you create a model with complicated array formulas, document the process to avoid further confusion!

Download the Sample File

This article discusses an issue on rounding error, specifically related to breaking down a total amount with a trend/inflation rate.  It’s more of a math problem than an Excel problem, and if you work in the healthcare industry, this problem probably will look familiar to you.

Note: “Trend” here is assumed to be a compounding annual cost inflation.  When “trending” an amount from one period to another, the length of time is measured between the mid-points of the two periods.

Suppose you made a projection for your client’s healthcare spendings to be $12 million in the upcoming calendar year, assuming 10% annual trend.  Now the client wants to break that down to monthly figures.

The answer is not $1 million for each of the 12 months.  The force of trend makes this spending an increasing function over time, i.e. February should cost slightly more than January, and so on.  The industry-standard formula is to apply the 10% trend, from the mid-point of the year to the mid-point of each month, i.e.

Monthly Amount = Annual Amount / 12 * (1 + Trend) ^ ([mid month] – [mid year])

While this is a perfectly valid and defendable approach, the results always have a problem – the “rounding error”.  As seen in the table below, adding the 12 months of breakdown back together produces a slightly higher sum than anticipated.  This error is usually small ($4,511 in this case, < 0.1%) and people tend to overlook such insignificant imperfection.


Numbers don’t add up

Note: This type of difference is widely known in the industry as a “rounding error”, but technically it isn’t.  Rather, it is a methodological inaccuracy.

But not everybody can tolerate imperfections – certain demanding bosses/clients could get very upset if your basic numbers don’t add up!  To correct this situation, I’ve seen people throw that $4,511 in one of the months, break it out evenly across the 12 months, or use some GoalSeek magic to add a margin to each number until the total comes out right.  None of these methods is pretty.

Take a step back and think like a Calculus student again… the trending cost can be represented as the area under curve of an exponential function, and the aforementioned approach to monthly amounts is a discrete numerical approximation of the integral!


The pink bars are the results of the previous method, which closely but not perfectly approximate the area under the blue (exponential) curve.

Now, if you can take a basic anti-derivative from point a to point b (one month to another), you’ll have a solution that’s better than a good approximation:

Monthly = Annual / Trend * ((1 + Trend)^(i/12) – (1 + Trend)^((i-1)/12))

The resulting formula is difficult to understand when you look at it for the first time.  However, once you understand the concepts, it’s very easy to explain.  The monthly breakdown sum up beautifully to the exact $12 million, so you won’t have to do any unsightly tweaking!


Numbers add up perfectly!

Download the Sample File

The sample file contains a hand-written derivation of the formula.

 

I finally felt like I was able to figure out Excel 2003.  I knew all the keyboard short-cut commands without even needing to look at the keyboard, was proud of my ability to convey data in fancy graphical representations, and was getting very comfortable with building Visual Basic macros.  Yes, with regards to Excel 2003- I felt absolutely dangerous.

Now its all different.  Here on my desktop is the brand-spanking new Microsoft Excel 2007 (right, it is not that new anymore considering it is already 2008).  Gone now are the familiar menus I once knew and replaced with a seemingly more complex simpler ribbon system.  It feels new to me, and I seem to work really really slow.  That makes me scared.  Very scared.  My goal is to become proficient and speedy with the newer Excel 2007.  In the process, I hope my learning becomes yours as well.

It is important to note that Excel 2007 has a NEW format unrecognizable by Excel 2003.  Does this mean that Excel 2003 users should completely avoid my posts?  Nope!  Since not all of my concepts will focus on the newer excel, the 2003 “.xls” will be available.  I will be sure to make this distinction!

How do you use an array formula to sum up element-wise minimum values?

Suppose you plan to buy several computer parts from either BestBuy or Circuit City.  The two stores have different prices for each product.  If you buy from one store exclusively, you won’t get the best deals possible.  To optimize the value of your hard-earned dollar, you want to find the least amount of money necessary to buy one of each part.

99.5% of the time, you’d set up a spreadsheet similar to the one shown below.  By using a MIN() function, you create a third column listing the lower price of each item.  Then it’s a no-brainer to add up the column for the total.


Download the Sample File

Occassionally, you may want/need to calculate this number in one single step.  Array formulas came to mind and it seems intuitive to write {=SUM(MIN([array 1], [array 2]))}.  However, that doesn’t work because MIN([array 1], [array 2]) does not give you an array of element-wise minimum values… instead, it gives you a single minimum value between the two arrays.

To fix this problem, you can employ the IF() function within your array formula.  This is short, clean, and easy to understand:

{=SUM(IF([array 1]>[array 2], [array 2], [array 1]))}

Or if you prefer to use the mask/filter approach, your formula could look like this:

{=SUM([array 1]*([array 1]<[array 2]) + [array 2]*([array 1]>=[array 2]))}

The ([array 1]<[array 2]) part creates a mask of 0’s and 1’s, which is used in a similar manner to the typical SUMIF-type of array formulas.  The two approaches use the same logic, and the only difference is the appearance.

A problem is always simple when you already have the solution, but believe it or not, I came up with the most convoluted approach and a super long formula when this problem was first presented to me.  I’ve included that formula in the sample spreadsheet – see if you can figure out what I was trying to do!

Download the Sample File

My intern came to me today and asked about displaying multiple lines within a cell.  I’ll write the first real post in his honor.

Most people know that when entering text in Excel, pressing ALT+ENTER makes a new line.  However, he needed to generate this multi-line string dynamically (via a formula).  The new line characters, carriage return (CR) and line feed (LF), are not as obvious in Excel as most programming languages make them.

Note: This refers to the ‘\r\n’ in C-styled languages and vbCrLf in Visual Basic. In many environments, however, only the latter (‘\n’ and vbLf) is necessary to create a new line.


Download the Sample File

One way to do this, which may be more intuitive to non-programmers, is to use the same ALT+ENTER key stroke by entering this formula:

= [first line] & “[press ALT+ENTER here]” & [second line]…

Another way to do this, which is longer but probably preferred by programmers, is to employ the new line characters (CR + LF):

= [first line] & CHAR(13) & CHAR(10) & [second line]…

Similar to aforementioned note, only CHAR(10) is necessary for the desired results.  Add CHAR(13) only if you have the pet peeve from certain programming styles.

This entry may sound obvious, but here’s the catch: by using the formula above, you probably will still see all your text concatenated on one line.  You still need to wrap text (Format -> Cells -> Alignment tab -> Wrap text) in order to see multiple lines!  Note: Excel automatically sets the wrap text property when you enter plain text (instead of a formula) with a line break in a cell.

Download the Sample File

Tags: , , , ,

When I was in college, I regarded myself as an Excel expert.  But as soon as I started getting paychecks in the “real world”, I stopped being so naive.  My Excel knowledge doubled, tripled, and went up at least 10-fold within three years.  Do I consider myself an advanced user now?  Sure.  Is there more to learn?  Of course.

Like other professionals whose worlds revolve around Excel, I gained most of my knowledge through solving topical problems, and I frequently found help through Google on blogs/forums/websites run by Excel experts.  Now that I’m much more comfortable with this software, I thought that it’s time to give back to the community.  Hopefully one day someone will be Googling a specific Excel question and find a solution here.  Then I’ll be glad.