Is Data Visualization still just a buzzword?

If asked this question two years ago, my answer would have been a resounding “Yes.” At that time, the focus was on displaying data using visualizations that were pleasing to the eye to help make a sale, impress the “C” levels, and to put a check mark in the box of capabilities. But times, and needs, have rapidly changed.

The tools available just two years ago were cumbersome to work with and the landscape only had a few major players. Fast forward to today, and the landscape is full of choices in both on premise and cloud based tools. It’s not just the toolsets that have changed however. It is also the maturation of best practices that have brought data visualization from a buzzword to a necessity for all companies, large and small.

So what are some of the best practices? How do I make data visualization succeed for my organization?

Here are a few keys to ensuring success.

1. Use the right visualization.

It’s a lot like when the newest iPhone comes out, and everyone races to be the first to get it. Only after the purchase however, you realize that there wasn’t that much of a difference from the previous version and you gain no productivity. Don’t be in a rush to use a new visualization just to say you can use it. Make sure that it tells the user something and is organized in a way that they can learn quickly. Nothing will kill a dashboard or report more quickly than ineffective visualizations.

2. 10 seconds to learn

The key to any dashboard or report is that the consumer must be able to have at least one takeaway in the first 10 seconds. This will lead to a higher adaption rate and make your audience come back begging for more data.

3. More isn’t better

Don’t overwhelm your audience with a dashboard or report that has too many visualizations. We have all seen demo dashboards with 10, 15 or even more visualizations. Although it looks impressive, it’s best to ask yourself questions such as: ‘Did I learn anything?’ and / or ‘Did it tell me a story?’ Most likely your answer will be no.

4. Tell a Story

Make sure that the data is always telling a story and not just being displayed for the sake of displaying. This is often found in situations where an attempt has been made to move an operational report into a visualization tool. Operational reports have their place in the story but it should be 2 or 3 levels deep with supporting visualizations first that have a drill path to the transaction level details.

5. Use Files/Slicers

Don’t create multiple copies of the same report just to show a different product, category or client. Use filters and/or slicers to help slice and dice your report and give your audience one place to go to get all their data. This gives one other benefit that individual reports don’t give and that is aggregate values across multiple products, categories and clients.

Take a look at the two reports below and try to apply the keys to data visualization above.

Report 1.

clip_image002

Report 2.

clip_image004

Did you notice that each report has the same data? Were you able to learn more from Report 2? While both reports have the same data, the right visualizations makes a difference in immediate comprehension. Using column charts to show trended data over time verses a bar chart allows the eyes to easily move from left to right and see the patterns. Next, tree maps are a great replacement for the age old pie chart, as not only do they show large amounts of data in a readable format, they also take your eyes in a progression from left to right as the size decreases. Additionally, funnel charts tell a much more concise story compared to a bar or line chart and draw you to not only see the values decrease, but also creates a visualization that quickly shows the disparity between values.

What’s next? If you haven’t already begun to adapt the concepts of data visualization within your organization, now is one of the best times to start. There is a right tool out there for everyone. If you find yourself wondering where to begin, reach out to a provider already offering data visualizations in your area of expertise. You don’t always have to recreate the wheel to take advantage of the areas of efficiency that they offer.

Dating Tips Part 1

This is the first in a series of dating tips, so if you need you’ve come to the right place.

 

Well ok tips about how to use Dates in a Data Warehouse, Power Pivot, and Analysis Services both Tabular and Multidimensional.

So here in part 1 we are going to start at the basics and cover the date table. The date table is where all the magic happens and without it our dates are pretty meaningless.

Why is the table so important?

  1. Without a date table almost all Time Intelligence functions will not work
    1. Year to Date
    2. Rolling Quarters
    3. Parallel Period
    4. Etc…….
  2. Can’t build Date hierarchies such as (Year, Quarter, Month, Date)
  3. Will not be able to support Fiscal Calendars

Here is a script that will generate a Date Table for you. I am populating the table using a recursive CTE which does have a limit of loading 32767 dates.

There are also 3 Variables that you need to set the values for

@StartDate default is 1/1/1999

@EndDate default is 1/1/2050

@FYStartDate default is 4/1/1999, The year of this must be the same fiscal year as the StartDate

Use Lookups the right Way

SSIS Lookup tasks are all too easy to abuse as they are easy to setup and not really think about much. Most commonly developers just drag the task in setup a reference table, map a column or two, add a column to be returned and click OK. Now while doing that will work no doubt it can cause a performance nightmare. So let’s setup this scenario then break down why you have to think about taking the easy way out.

Scenario:

Let’s use our favorite friend AdventureWorksDW and say we want to load the FactInternetSales table, which has 8 dimensions keys and those dimensions would be.

  1. DimProduct
  2. DimDate (OrderDate)
  3. DimDate (DueDate)
  4. DimDate (ShipDate)
  5. DimCustomer
  6. DimPromotion
  7. DimCurrency
  8. DimSalesTerritory

So if we created a SSIS package to load the using the easy way described earlier we would have a Source and 8 Lookups just to handle getting the lookup keys portion of this package. If all I did was create my lookups as table references I would have a data pull that looked something like this.

TableName Number of Rows Number of times Queried Row Size (KB) Total Data Size (KB)
DimProduct 606 1 7.79 4720.74
DimDate 2191 3 .14 920.22
DimCustomer 18484 1 1.84 34010.56
DimCurrency 105 1 .11 11.55
DimPromotion 16 1 2.12 33.92
DimSalesTerritory 11 1 .30 3.3

However, we don’t need all the fields from these lookups for we only need the fields to join on and the field to set the Dimension Key for. Each of the table queries should look more like this.

  1. DimProduct (ProductKey, ProductAlternateKey)
  2. DimDate (DateKey, FullDateAlternateKey)
  3. DimCustomer (CustomerKey, CustomerAlternateKey)
  4. DimPromotion (PromotionKey, PromotionAlternateKey)
  5. DimCurrency (CurrencyKey, CurrencyAlternateKey)
  6. DimSalesTerritory (SalesTerritoryKey, SalesTerritoryAlternateKey)

Now the Data footprint will be the below.

TableName Number of Rows Number of times Queried Row Size (KB) Total Data Size (KB) Reduction in Size %
DimProduct 606 1 .05 30.3 99.36%
DimDate 2191 3 .007 15.337 98.33%
DimCustomer 18484 1 .03 554.52 98.37%
DimCurrency 105 1 .0098 .049 99.56%
DimPromotion 16 1 .0078 .1248 99.63%
DimSalesTerritory 11 1 .0078 .0858 97.4%

As you can see the Reduction in size is significant and it is effected in three areas: The Database Engine, Network and SSIS Engine all have to deal with less data now. So please never choose a table as your source but rather take the time to write the query and save yourself from performance headaches down the road and you might make your DBA happy too.