August 2008

You are currently browsing the monthly archive for August 2008.

I’m dedicating this post to the first person to compliment on the Excel Workbook blog…

Some time has passed since Ron‘s Beijing Olympics post.  It’s now 2020 and Laser Tag has become an official Olympics sport.  The globally famed Laser Tag champion Wang Y. blew her way through the games with amazingly acrobatic shooting, docking, and sniping skills.  After the thrilling final round where all athletes gave their 101%, you as the game admin recorded the scores as follows:

Download the Sample File

Having studied Ron‘s medal comparison model for the past three Olympics, you now have full appreciation for the neat user interfaces that Excel allows you to create.  Now you want to use that knowledge to create an entirely new and personalized score reporting system.  Here’s what you have in mind:

  • Scores will be presented in a bar graph
  • Each participating athlete will receive a personalized score card
  • Each athlete will be listed first on her score card with name spelled out on the graph, in honor of her excellent sportsmanship
  • All the other athletes will be listed in alphabetic order by country code; their names will not be shown

For example, the score card for China’s Wang Y. should look like this – her name, country, and score are listed on top, followed by the other countries in order: Australia, Brazil, (skipping China), Germany, and so on.

While this task could be done fairly quickly by hand, it could get very tedious.  You are a busy game admin after all, and don’t have the time to personalize 10 score cards manually.  You already know how to locate the chosen athlete, but how do you re-order the remaining list?

The INDEX() function comes in handy.  It returns the value of an element in a table or an array, selected by the row and column number indexes.  It often complements the MATCH() function, and can be considered as an inverse function to MATCH().  Note how it’s accomplishing the same task as the OFFSET() function used for Beijing Olympics.  Now you write:

=INDEX(
          [score array],
          CurrentItemPosition – (CurrentItemPosition<=ChosenItemPosition)*1
          )

The second argument determines whether to shift the reference by one row.  When China is selected (position 3), the 1st and 2nd countries need to be shfited into positions 2 and 3, but all other countries in the list will stay put for positions > 3.

Now you have a neat user interface that generates a unique score card as easily as a drop-down menu selection.  Many people resort to writing macros for this type of tasks, but they can more often than not be done with a relatively simple set of formulas.

The following is the score card for the USA athlete, who won silver.

Congratulations to all!

Download the Sample File

The Beijing Olympics games is all about the shimmer, shine and glamor: the Olympic Medals. If I were interested in comparing the number of Olympic Medals by country using Excel, how would I do this?

Download the Sample File

To make tasks such as this a bit more dynamic, I often like to create user interfaces. When creating an interface, the general rule of thumb is that it should be simple, allow the user to select his/her preferences and produce accurate results (both detailed and executive summaries).

Download the Sample File

To create this user interface, I will use the Excel functions MATCH and OFFSET to fetch data. This data is the basis to populate the graph and perform the simple difference calculation.

MATCH

  • =MATCH(lookup_value, lookup_array,[match_type])
  • MATCH returns the index / element position for the specified lookup value based on an associated array. The optional match type argument is FALSE if a lookup value must match an element within the array. TRUE or blank (default) in combination with a sorted array will return an index based on the range of values.
  • Ex, lookup_array = {1, 3, 5, 7, 9}.
    • MATCH(5, lookup_array, FALSE) = 3 because 5 is the 3rd element within the lookup_array.
    • MATCH(4, lookup_array, FALSE) = #N/A.  Since match_type = FALSE an exact match must occur which means that the lookup value must exist within the lookup array!
    • MATCH(4, lookup_array, TRUE) = 2. lookup_array must be sorted if match type is TRUE.
      • lookup values from 3 to <5 will return an index of 2,
      • lookup values from 5 to <7 will return an index of 3, etc, etc.

OFFSET

  • =OFFSET(reference, rows, columns, [height], [width])
  • Think of OFFSET in terms of Excel key navigation. Excel defaults at cell A1.
    • Want to get to cell B1? It is easy- simply press the right arrow or in other words, shift forward one column.
    • Cell A2? Shift forward one row / press the down arrow.
    • In Excel, OFFSET(A1, 0, 1) = B1 and OFFSET(A1, 1, 0) = A2. Therefore, OFFSET(A1, 2, 1) = B3.
  • OFFSET can also retrieve array / matrices of information when using the optional height and width arguments.

How does it work?  MATCH allows us to determine the position of each country and the medal type.  The MATCH results filter to OFFSET where Excel can retrieve the appropriate data from our table.

We have created this exhibit with just a few tools: Excel’s built in graph formats, conditional formatting, and the MATCH and OFFSET Excel functions. This type of model can have its various uses whether it is to create an exhibit to quickly compare financial ratios of equities, varying benefit plan design costs, and even Olympic medal counts.

Note: OFFSET and MATCH are not the only functions we can utilize to create the output, some of you may prefer MATCH with either VLOOKUP or HLOOKUP.

Download the Sample File

While arrays are great for grouping a system of formulas, their real advantage is to enable complex subtotals and lookups that ordinary Excel functions fail to support, through a technique known as the array sum or the conditional sum.

The general syntax is:

{=SUM(
     [condition 1]*
     [condition 2]*…
     [value 1]*…
     )
}

Note the brackets enclosing the formula.  To enter the formula as an array, press CTRL+ALT+Enter.

Each of the conditions and values is an array of corresponding size (equal size most of the time, but not necessarily).  The enclosing SUM() function and the array brackets serve as a SUMPRODUCT().  Note that

=SUMPRODUCT([array 1], [array 2], …)

is equivalent to

{=SUM([array 1]*[array 2]*…)}

The array sum multiplies the value array by one or more condition arrays, which contain 0’s and 1’s (FALSE and TRUE).  These condition arrays serve as filters (or masks) to zero-out values not corresponding to the specified conditions.  By picking up the remaining values this way, we can take SUMIF(), COUNTIF(), VLOOKUP() and other functions beyond one condition and one dimension.

Below is an example of box office sales records.  To make sense of this data, we’d often need to analyze subsets of the records.

Download the Sample File

With SUMIF() and COUNTIF() functions, we can easily obtain a subtotal for all sales in July.  We’ll practice using an array sum instead:

{=SUM(
     ([month column]=”July”)*
     ([sales column])
     )
}

Note that by omitting the [sales column], we’d be counting the number of records instead of summing their values.

Moving on, let’s get a subtotal for all June sales in Roanoke.  SUMIF() and COUNTIF() fail to be useful here.

{=SUM(
     ([month column]=”June”)*
     ([city column]=”Roanoke”)*
     ([sales column])
     )
}

Similarly, we can add a third condition (or in fact, as many as necessary):

{=SUM(
     ([month column]=”August”)*
     ([city column]=”New York”)*
     ([movie column]=”Transformers”)*
     ([sales column])
     )
}

In the last example, only one record satisfies all conditions.  The result is similar to that of a VLOOKUP(), except of course, we have a lot more freedom by using an array sum.  Sometimes, you might find yourself needing to specify more than one condition in a lookup and not wanting to double count any potential duplication of records.  What you can do is to use a MAX() function in place of the SUM() to limit the result to one record.

Here is a typical data snapshot summarizing the subtotals in a two-dimensional table:

Lastly, let’s see how an array sum can obtain subtotals (or single-value lookups) in such a two dimensional array.  The trick is to line up the dimensions of the condition and value arrays:

{=SUM(
     ([3×1 month column]=”July”)*
     ([1×3 movie row]=”Batman”)*
     ([3×3 sales table array])
     )
}

This article provides an introduction to the more advanced methods of using Excel formulas.  Once you get familiar with these methods, you will find a much more satisfying experience analyzing data in Excel.

Download the Sample File

 

Since many of this blog’s posts will be about array-based tricks, it might be a good idea to have a proper introduction to Excel’s array formulas.

By default, Excel’s functions and formulas take one or more input values and produce a single output value. Think of the typical multiplication and SUMPRODUCT() function.  When solving a more complex system of problems, however, it is often desired to process an array of similar values concurrently.  Hence the array formulas.

Download the Sample File

Let’s start with the real basics.  You’re the manager of a video game store, and for recordkeeping, you’ve just created a spreadsheet listing the prices of your inventory of video game consoles.  Now you want the identical price listing to appear on another part of the spreadsheet, and here’s what you can do:

  1. Select an area identical in size to the source data array (in this case, 3 rows x 1 column)
  2. Type the “=” to get a formula started, then drag to select the source data array
  3. Press Ctrl+Shift+Enter to complete your first array formua

Note that, in the formula bar, a pair of brackets {} enclose your formula.  This tells you that it’s recognized as an array formula.

Note: if you pressed the Enter key without holding down Ctrl and Shift, the brackets would not appear and only the first element in the array would be recognized.  Additionally, you cannot type in those brackets.

An array does not have to be a one-dimensional vector.  You can also reproduce an entire table by employing a two-dimentional array (3 rows x 2 columns):

So what now?  Pretty boring and useless, huh?  For this type of tasks you would typically use a simple “=” formula and copy it across the table.  In fact, we’d usually recommend that as well.  But this approach has its advantages – because the entire array (1D or 2D) is produced by one formula, you cannot change any of its elements (try typing over any of the cells… Excel won’t let you).  When processing large amounts of data, it’s not always easy to check for human errors, i.e. if your coworker’s son messed up one of your 5,000 rows of identical formulas.  Array formulas help ensure data integrity.

Moving on… now it’s the end of the month and your intern compiled the sales records for the store.  But he listed the consoles horizontally on three rows instead of vertically!  To fix this, you use the TRANSPOSE() function to “rotate” the table… turning rows into columns and columns into rows.  This is a very handy function that’s strictly used with arrays.


To turn this 2×4 table into a 4×2 table


Use the TRANSPOSE() function… don’t forget, Ctrl+Shift+Enter!

Okay, before we wrap up this post, let’s look at something actually useful, i.e. more than simply duplicating values.  Given your price list (a 3×1 array) and your intern’s sales records (a 1×3 array), you now want to calculate the total revenue.  SUMPRODUCT() comes to mind since you want to multiply the price by unit for each product, and add them all together.

However, SUMPRODUCT() requires all of its input parameters to be of identical size.  If you simply entered in the row array and the column array, you’d get a #VALUE! error.  You could TRANSPOSE() one of the two arrays to match the dimension of the other, or you could simply use TRANSPOSE() within your SUMPRODUCT():

{=SUM([array1], TRANSPOSE([array2]))}

Note the brackets – don’t forget to press Ctrl+Shift+Enter.  This calculation gives you the total revenue by first turning the two arrays into the same dimension, and then apply the SUMPRODUCT() function.

Download the Sample File

In the next article, we’ll talk about array sums, which lead into a whole wide world of possibilities.

Let’s recap.

  1. You want to learn about Excel.
  2. You want to buy computer parts to build a PC.
  3. You research the prices at Best Buy and Circuit City.
  4. You research the prices at more stores.

You have optimized the computer part prices and determined that the lowest total cost is $640.63. Before you pat yourself on the back for your hard work, you must realize that although cost is important, it means little unless you determine where to shop. The following Exhibit cleanly indicates the best price option and its respective store.

Download the Sample File

One option to create the exhibit is to use a manual “select cell, highlight, change font” (rinse, repeat, recycle many many times). This manual process may be fine for small projects, but the task can quickly become tedious if information changes or expands quickly.  Let’s create the exhibit using Excel’s highly powerful and usable Conditional Formatting (Excel 2003: Format > Conditional Formatting).

  • If an additional “cheaper” column is available, this task is easy. Choose the appropriate cells and set the “Cell Value Is” to refer to the Cheaper column.  (In the example below, column I refers to the before mentioned “cheaper” column.)  I have chosen the “equal to” option, but there are many other options that are easy to understand.

Download the Sample File

  • Now, lets make the assumption that you have no interest in creating an additional “cheaper” column.  No problem.  Instead of the default “Cell Value Is“, change the condition type to “Formula Is“.  This allows you to write an Excel formula which must solve to True or False (or equivalent).  In the example below, columns C through G refer to prices specified in Best Buy through Walmart.

Download the Sample File

Conditional Formatting is a very useful tool. When working for a financial firm, I used Conditional Formatting to error check, show real time changes in Equity Prices (Bberg BDP function), and highlight cells for concern.  What is really nice about Conditional Formatting is that you can utilize all the popular Excel formulas.  In many ways, you can create instant feedback for your spreadsheets because you can set up cells to format automatically based on your input/formulas.  Unfortunately, there are restrictions.   Excel 2003 only allows 3 conditions, the size of the file will increase, and large spreadsheets can become noticeably slower.  Microsoft expanded this tool in Excel 2007 and offers >3 Conditional Formatting formulas and more intuitive default conditions.  I have yet to explore all the Excel 2007 additions, but perhaps this could serve as a post in the future!?  Thanks to Peter for his original post.

Note*  If you need to build a computer, do your own research.  These prices most likely do not reflect actual store prices!

Download the Sample File

Lets revisit Summing Element-Wise Minimum once again. To recap:

“Suppose you plan to buy several computer parts from either BestBuy or Circuit City. The two stores have different prices for each product. If you buy from one store exclusively, you won’t get the best deals possible. To optimize the value of your hard-earned dollar, you want to find the least amount of money necessary to buy one of each part.”

Lets suppose you are not convinced you have truly optimized your hard-earned dollar. Therefore, you decide to add prices for Office Depot, Duane Reade and Walmart.

Download the Sample File

An easy way to solve this problem would be to create a new column which utilizes the MIN() function. But, since you believe in optimization you want it ALL in one single cell. What are your options? I guess you can start creating a crazy array that will be difficult to understand such as the following example which determines the cheapest total price between 3 stores (arrays):

{=SUM(IF([array 1]<[array 2],IF([array 1]<[array 3],[array 1],[array 3]),IF([array 2]<[array 3],[array 2],[array 3])))}

Looks… um… um… right. So outside of adding a column and creating a crazy array formula, what other option is available? Enter Microsoft Visual Basic! It seems that many Excel based users may go pretty far in their careers without ever understanding the super hyper intimidating but useful tool. (Excel 2003 keyboard shortcuts provided)

  • Tools > Macro > Macro… (Alt + F8) and
  • Tools > Macro > Visual Basic Editor (Alt + F11).

So now, wouldn’t it be nice if we could just write a function and refer to a bunch of cells and get exactly what we want? Guess what? With Visual Basic, you can… With the Visual Basic code below, you can add as many stores as you like while solving for the optimal price.  It is important to note that everyone has their own programming style, you can feel a glimpse of my style with the code below.

Download the Sample File

I certainly encourage you to download the sample file because I have provided comments within the Visual Basic Module. That said, I do want to make a few quick notes.

  1. A Visual Basic function (if non-Private) can be called from the Excel Spreadsheet directly. The spreadsheet contains the formula, =minSumItNow([Array 1 … Array N]).
  2. Visual Basic Function differs from Sub because Function must return a value. The value is set by assigning the Function name.
  3. [Array Name].Rows.Count and [Array Name].Columns.Count are very useful. This allows the user to understand the size (vertically and horizontally) of arrays. This is useful because users do like to include a range of cells as an input. (For Each is useful as well and will be discussed in a later post)
  4. Visual Basic might seem like more work to learn, but the rewards are plentiful. Learn Visual Basic to:
    • Impress your colleagues.
    • Work more efficiently.
    • Automate tedious tasks.
    • Unlock the true power of Excel.

Download the Sample File