Tips for Technologists #7: Excel with Excel

In Tech Digest by Nick Ruffilo

Tips for Technologists is a series aimed at teaching you to engage with technology in best way possible. You can see all the Tips for Technologists articles here.

By Nick Ruffilo

This tutorial is geared towards all expertise levels and people in nearly any role. Marketers, business analysts, salesperson, librarian, coder, nearly anyone can benefit from the lessons below on maximizing the use of Microsoft Excel.

Quick note on Apple Numbers

While Apple Numbers is the Mac “equivalent” of Excel (spreadsheet application), I’ve found it to be deficient in quite a few areas. I will also admit that I don’t know it nearly as well as Excel, so if someone is a Numbers master and would like to write the equivalent tutorial for Numbers as a guest post, I’ll be happy to see that it gets run.

Excel is a Data Visualization Tool

You may think of Excel as a spreadsheet tool, but it is far more than that. It is, in my opinion, the best all-purpose data visualization tool. It is the only Microsoft Office tool that I’ve consistently used through my career and still use today. For any data set of 32,000 records or less, it is one of the quickest way of creating ad-hoc reports and handling data revisualization. It is not a perfect tool, it can be a performance hog, and it crashes from time to time, but all those flaws are worth overlooking. I will do my best to explain to you how to easily unlock the power of Microsoft Excel.

A Note About Versions

I’ve been using Excel since Office 97. Nearly every feature I will mention (except pivot tables, which have been in since XP I believe) has been around since then. There are some new features, but everything should work in all versions. Something to note — some functions have changed version-to-version, so if an example doesn’t work, do a google search on how to do that specific function in your specific version. Most notably, vlookup seems to change version to version.

Getting your data right

The first step in maximizing Excel is getting your data in a usable format. For all my examples, I’ll use the following dataset (pictured below):

Its a fictional sales chart about sales of a book. This is often how you’ll get your data: sales date, an ISBN, a channel, and a price. While this is useful, and you can extrapolate some useful charts from it, with a few simple excel functions, we can add immense value. We’re going to add a few new columns and we’re going to give them the values Day, Month, Year, Book Type and Weekday.

The Day, Month, Year and Weekday will all be done with a simple function. To use function in excel, type “=” into a cell, then the function after. We’ll start with day. Type “=day(a2)” into your E2 field and hit enter. The value of 1 will appear (will give you just the DAY number of the date). The next trick is a huge timesaver. Click the field you just entered the formula on, then hold shift and select the field (same column) at the bottom of your data set and hit ctrl-d. That will fill every row in that column with the function you just typed.

In the next few columns (for month, year, and weekday), repeat the same function but use: =month(a2), =year(a2), =weekday(a2). If you don’t want to type A2, you can click the field you want to reference or use your keyboard arrow keys to navigate. A2 is a cell reference.

For the last field, Book Type, we want to differentiate between a paper book and ebook. We will use the IF function which works like this: “=if(Condition, True result, False result)”. Our function will look like this: =IF(OR(C2=”NYC Store”,C2=”LA Store”),”pBook”,”eBook”)

Broken down, the function looks like this:

=IF ( //Start the If statement – use the condition to determine what to display
OR( //we only have 2 stores, so to make the function simpler/quicker, we’ll check to see if the channel was either store. If so, it was an ebook, if not, it was a paper book.
C2=”NYC Store”,
C2=”LA Store”
),
“pBook”, //if the condition (either store) was met, display pBook
“eBook” //if neither condition was met, display eBook
)

If an If statement/function is new to you, just think about it logically. “If THIS, then show this, OTHERWISE, show this.” It takes some practice, but it isn’t rocket science. With our new fields, our data should now look like this:

Note: Weekday is a number representation for the Day of Week. The default is 1=sunday, 7= saturday, but this can be changed (see http://www.techonthenet.com/excel/formulas/weekday.php)
Adding these addition fields will actually create a HUGE amount of value (and processing time) when utilizing Pivot Tables.

Using Pivot Tables

A pivot table is a live, customizable (on the fly) data table generated from a set of data. Functionality of Pivot Tables has grown/changed quite a bit since its initial creation, but the basic concepts have held, so that is what I’ll cover. To create a pivot table, simply select ALL the data you want (headers as well). If it is an entire spreadsheet, you can click the <> in the top left to select all. Once selected click the Data (menu) -> Pivot Table Report. You will be asked where you want to pull data from (if adventurous, you can actually link Excel up to your database to directly download data. For this tutorial, I’m only going to discuss using data on a spreadsheet, which is the default option). Since you’ve already selected your data, click “Next >”. Again, your data is selected, click “Next >”. Your data will be read (if you have alot of data, this can take alot of time).

The last step asks you where you want to create the pivot table. I normally create it on another sheet, for cleanliness, but thats your choice. Click finish.

I don’t see anything! What tom-foolery is this? That is correct — you start with a blank table that tells you to “Drop Data Items Here” and “Drop Rows/Columns here.” This is where the magic happens. You get to customize the data you want to see, how you want to see it, and how it is to be organized. Along with your pivot table should also be a list of columns to drag/drop.

The fields you see are exactly the ones that you had previously. I dragged (in the left side): Channel, Year, Month, Day. In the “drop data items here” I dragged” Price. Initially, whatever you drag into “Drop Data Items Here” will result in a COUNT of the items. For price, you want sum, not price. To change that, right-click on any of the numbers listed under “Price” (Total) and select “Field Settings.” From there, you can choose how you want your data displayed – average, sum, count, min, max, etc.

My data set is limited, but here are the results of that:

I can now see my sales, by channel, broken down by year, month, then day. I can also change the options of whether or not to display the totals for each column.

Play around

There are quite a few very good books on how to do Pivot Tables, but I’ve learned the best from just playing around. I took a sample of about 1000 real records (I’m sure you have an excel spreadsheet of data lying around somewhere). I made a copy of it, and went to town. Remember — a Pivot Table doesn’t actually change the data, so if you mess it up, simply delete it and start again!

Where does Day of Week come in? Well, you could use it to find trends in buying. If you notice that 40% of your store sales happen on Monday, but 80% of your ebook sales happen on Saturday/Sunday that can greatly effect your marketing (do a big digital push on Friday and Saturday. Basically it was to demonstrate that there are very simple functions that extrapolate data from something like a date that can help you better visualize data.

Other useful functions for data reconfiguration

Like the weekday() function (from now on, I’ll use () at the end of a word to denote that it is a function, as that is standard programming style). there are tons of built-in functions with excel that help you extrapolate useful data out of other data. You can find a great list of Date functions here. The way to read the following functions is like this: function_name(inputs). For most inputs, it is a cell (A3, B9, etc) or a range (A2:A9).

  • Sum(range) — Will provide you a sum of all the numbers within a range. If a word/string is encountered, it will be ignored.
  • average(range) — Provides the average of the numbers within a range. If a word/string is encountered, it will be ignored.
  • isodd/iseven(cell) — Will tell if the number in the referenced cell is odd/even
  • or(condition1, condition2) — Will return 1 (true) if either condition is met
  • and(condition1, condition2) — Will return 1 (true) if both conditions are met
  • concatenate(cell/string, cell/string) — Will merge 2 cells (as text). You can also simply note text to add. So you could do =concatenate(“Bought from “,C2) to add “Bought From ” to the beginning of the purchase channel.
  • mid(cell, start position, number of characters) — Will give you part of a text. You define the start point and how many characters. So if you wanted just the first 3 characters you’d use: mid(A1, 0, 3)
  • substitute(cell, old text, new text) — Does a basic find/replace. So, if you wanted to remove “store” from your channel, you would do: =substitute(C2, ” Store”,””)

Understanding vLookup

vLookup is a great way to map multiple values to one, or translate data quickly/efficiently. In the prior example, we used an moderately complex if() with an or() in it to determine if it was a paper book or ebook. But, if we had 50 stores and 5 digital channels, that would quickly get very long and hard to read. For cases like this, vLookup becomes the correct solution. On another spreadsheet (same file) all you need to do is create a lookup table. Make it 2 columns, the first column is the value you have, and the 2nd column is the one you’d like vLookup to display.

For my example below I used the function: “=VLOOKUP(C2,$A$19:$B$23,2, FALSE)” with the following dataset:

To explain the items in vLookup and how I chose to do them:

  1. C2 in the formula is my reference to the Channel. I started at the top, and entered this into J2 (I hide some fields, hence the jump from C to J.
  2. The 2nd argument is your lookup range. I used my mouse to select A19:B23 — my vLookup table below. I recommend using another sheet in the same workbook to keep things clean, but for example purposes and easy screenshots, I did it this way.
  3. Since I wish to copy this formula, but I don’t want the reference table to be relative, I added $s (see below for explanation) to my reference table
  4. The next argument is the field you’d like displayed. The 2nd field in my table was what I wanted displayed, so I put “2”
  5. The final argument is a boolean (true/false) and is tells if vLookup should be a range search or not. Range search allows you to have a table of numbers and give a reference to something like 2.5 (when you only list 2 or 3 as values). For text, you should make it false.

Know where your money goes

Money, yay! Just kidding. Excel has a wonderful feature to be able to “move” a reference when you copy. Thats what lets you do ctrl-d to copy something down (or copy/paste). For example, if you have a function “=A2+B2” and copy it to the 3rd row, it will become: “=A3+B3” Similarly if you copy it 2 columns over it becomes “=D2+E2”. But what if you don’t want that? Say you wanted to multiply everything by a single cell at the top of your document (an assumption value for example). You can do this using the $. $ in excel means “hold this reference.” You can hold a column, a cell, or both. So, if you had “=$A2+B2” then moving it down would be: “=$A3+B3” and moving it right would be “=A2+E2”. If you wish to fix both column AND row, you use two $s: “=$A$2+B2”

This stuff is complicated

The final special character that I’ll touch upon here is the !. If you want to reference a cell on a different worksheet (same excel file) you simply use [worksheet_name]!, so “sheet2!A5”. But, a nice shortcut for people who have long/complex sheet names is simply to start writing the function, and when you come up to the cell reference input, just click on the sheet you want and click on the cell. Excel will auto-fill in the sheet & cell. Keep in mind, it will not do a static reference (use of $), so if you are using an assumptions sheet, you’ll need to add those in after you finish the function before you copy.

Touching upon VBA for the adventurous

Excel for Windows has a built-in programming language called VBA (Visual Basic for Applications). While it can be finicky, it is actually extremely powerful. To give you an idea how powerful it is, I’ve seen complete accounting systems and stock-trading applications written in VBA/excel. For tons of reasons, VBA is not supported in Excel for Mac (all the other functions are). VBA allows you to use your worksheet as a UI and for complex code to be written where the built-in functions of excel are not good enough. There are also a huge number of scripts already written that are downloadable via the internet — but do beware, VBA is powerful, which means that running a script with malicious code could delete the current spreadsheet or potentially all the files on your computer. So, before running any scripts downloaded from the internet, make sure you know the basics of coding and walk through the script to understand what it is doing (and where potential harm may be).

The warning stated, writing VBA is like writing any other programming language. You add a button to your spreadsheet, then give it some functional code. The built-in IDE for VBA is actually quite good and it includes autocomplete for function names, it tells you what parameters are needed and what the expected outcomes should be. The compiler errors aren’t very descriptive, but offending code should be highlighted (in the way that incorrectly spelt words are highlighted in Word).

VBA allows for greater integration with databases, allows for automation, and can turn an ordinary spreadsheet into a useful tool. VBA Can be used to send emails, generate reports, and reformat data very quickly. In fact, there are even some games written completely in VBA/Excel.

About the Author

Nick Ruffilo

Nick Ruffilo is currently the CIO/CTO of Aerbook.com. He was previously Product Manager at Vook and CTO of BookSwim.