The High Performance HMI Handbook
A Comprehensive Guide to Designing, Implementing and Maintaining Effective HMIs for Industrial Plant Operations
Bill Hollifield, Dana Oliver, Ian Nimmo, and Eddie Habibi, PAS, 2008
Dashboard displays come in many types, depending on the nature of the information that’s being monitored. While it’s true that all monitoring displays share many best design practices in common, each situation requires specialized designs as well. For example, an airplane cockpit display should look and function quite differently from a business sales dashboard. A book entitled The High Performance HMI Handbook (2008) provides design guidance specifically for displays that are used by control operators in industrial plants. (HMI is an acronym for “Human Machine Interface.”) Apparently, the vendors that develop these systems are like most business intelligence vendors: they don’t understand how to present information effectively, especially for data monitoring and analysis. In fact, they promote really bad data presentation practices. Here’s an example of a typical industrial control room display.
If you’ve read my book Information Dashboard Design and now go on to read The High Performance HMI Handbook by Bill Hollifield, Dana Oliver, Ian Nimmo, and Eddie Habibi, you might think that one of us copied the other’s material. When I wrote my book in 2006, however, I wasn’t familiar with the work of these authors, and I have no reason to believe that they were familiar with mine when they wrote their book last year. The reason the principles and practices presented in our books are so consistent with one another, in many cases down to precise details, is that we are drawing from the same research literature (human factors, human-computer interface design, cognitive science, information visualization, graphic design, etc.) and have both honed our expertise through years of designing practical, real-world data display solutions.
Where our books differ, it is because of differences between the requirements of business dashboards and those of displays used to monitor real-time industrial operations. Their book is rich in details that apply specifically to control room monitoring, down to the ideal configuration of display devices and the screen colors that provide optimal readability in a typical control room. It is because of the highly specific and therefore limited nature of this book’s audience that it bears the high price tag of $129.99. If you need to design displays for control operators, however, this price is a pittance compared to the benefits that you’ll derive from reading this book.
Let me share a few brief excerpts from the book to give you a peek into its contents.
Regarding the ineffective and irresponsible data display practices of the vendors that develop Distributed Control System (DCS) software:
There is a widespread need for the information in this book. It is not provided by the DCS manufacturers. In fact, the DCS manufacturers usually demonstrate their graphic capabilities using example displays violating almost every good practice for HMIs.
DCS vendors have now provided the capability of creating HMIs with extremely high sophistication, power, and usefulness. Unfortunately, this capability is usually unused, misused, or abused. The basic principles of effective displays are often not known or followed. Large amounts of process data are provided on graphics, but little information. Information is “data in context made useful.” In many cases, the DCS buyer will proceed to design and implement graphics based on the flashy examples created to sell the systems, unaware that, from a usability and effectiveness point of view, they are terrible.
Sound familiar? This next excerpt will sound familiar as well. Just as business intelligence vendors promote do-it-yourself solutions without providing the guidance that people need to analyze and present data effectively, DCS manufacturers leave it to companies to design their own control displays.
We would think it strange if Boeing sold jetliners with empty cockpits, devoid of instruments logically and consistently arranged for use by the pilot. Imagine if they suggested, “Just get your pilots together and let them design their own panel. After all, they’ll be using it.”
Imagine if your car came with a blank display screen and a manual on how to create lines, colors, and shapes so you could build your own speedometer. While this might actually appeal to the technical audience of this book, the rest of the world would think it strange and unacceptable. And, could you operate your car by using the “Doom” display motif created by your teenage son?
This do-it-yourself approach, without consistent guidance, is the general case with industrial graphics and is a major reason the results are often poorly conceived, inconsistent, and generally of low quality and performance.
Here’s a short quote that will grab your attention.
A graphic optimally designed for running a process and handling abnormal conditions effectively will, in fact, look boring.
Effective monitoring displays don’t “Wow” people with immediate graphical appeal. People often look at my dashboard designs and think “Where are the colors and those cute gauges that I like so much?” Here are a few of the characteristics that are listed as effective for HMI displays:
- Important information and Key Performance Indicators have embedded trends.
- There is no gratuitous animation.
- There is very limited use of color and alarm colors are used only to display alarms and nothing else…Bright, intense (saturated) color is used only for quickly drawing the operator’s attention to abnormal conditions and alarms. If the process is running correctly, the screen should display little to no color.
- Equipment is depicted in a simple 2-D low-contrast manner, rather than brightly colored 3-D vessels with shadowing.
- Layout is generally consistent with the operator’s mental model of the process.
This is all that I’ll share as a glimpse into The High Performance HMI Handbook. If you’re responsible for designing effective industrial control system displays, $129.99 is a small price to pay for the useful guidance in this book.
What Intelligence Tests Miss: The Psychology of Rational Thought, Keith E. Stanovich, Yale University Press, 2009.
Many prominent thinkers over the last few years have pointed out that IQ tests fail to test many abilities of the mind that are useful for making our way in the world. Most argue that there are types of intelligence other than what IQ tests measure, such as emotional intelligence. In his insightful book What Intelligence Tests Miss: The Psychology of Rational Thought, Keith Stanovich frames the problem differently. He argues that it is acceptable and even useful to limit the term “intelligence” to the abilities that IQ tests measure, but that in addition to Intelligence Quotient (IQ), a measure of algorithmic thinking, we should also assess Rational Quotient (RQ), a measure of reflective thinking. IQ fails to measure our ability to exercise good judgment and to make good decisions. Smart people often do dumb things. This is because there is almost no correlation between intelligence and our ability to think rationally, that is, to avoid the thinking errors that lead to poor judgments and the resulting bad decisions that undermine our best interests.
Testing IQ has become thoroughly integrated into American society and values. Stanovich points out that “In our society, what gets measured gets valued.” IQ, either measured directly or through proxy tests such as the SAT, has become the standard that determines academic and professional opportunities, yet it entirely fails to measure people’s ability to think rationally, which is every bit as important. In fact, because rationality has been so thoroughly ignored, we now have an American workforce that is sadly lacking in this critical ability. People throughout organizations, from the lowliest workers to the loftiest leaders, make bad decisions that undermine their interests based on irrational, error-prone assessments of the situations rather than rational consideration of available evidence. In his book Breakdown of Will (2001), George Ainslie describes the situation we find ourselves in today:
The prosperity of modern civilization contrasts more and more sharply with people’s choice of seemingly irrational, perverse behaviors, behaviors that make many individuals unhappier than the poorest hunter/gatherers. As our technical skills overcome hunger, cold, disease, and even tedium, the willingness of individuals to defeat their own purposes stands in even sharper contrast.
That we focus so much attention on intelligence and value it so greatly is not the problem; the problem is that we focus on and value it so exclusively. It makes perfect sense to value intelligence, because life in the modern world has become increasingly complex. This is certainly true of business. Consider the world of banking. In the movie “It’s a Wonderful Life,” the local banker who lent money to familiar folks in his community could have managed with less intelligence than the bank executives of today who oversee dozens of departments, each handling a host of intricately complicated financial transactions. Both then and now, however, a high level of rationality has been required. Its importance, nevertheless, has become under-appreciated.
Legal scholar Jeffrey Rachlinski points out a problem with the way professionals are trained today:
In most professions, people are trained in the jargon and skill necessary to understand the profession, but are not necessarily given training specifically in making the kind of decisions that members of the profession have to make.
On several occasions, I’ve written and spoken about this problem, especially as it relates to data analysis and presentation. Rather than learning the concepts and skills that are required, we put a software product on employees’ computers and assume that this is all they need. Software vendors have long promoted this line of reasoning in the way that they market their “intuitive,” “self-service” products.
The truth is, we need a full range of cognitive skills to face the challenges of the workplace and of life in general. As a culture, we must embrace rationality by promoting its value and supporting its development and use as thoroughly as we’ve embraced intelligence.
What does Stanovich mean by rationality?
To think rationally means adopting appropriate goals, taking the appropriate action given one’s goals and beliefs, and holding beliefs that are commensurate with available evidence.
Not everything in life requires rationality or even intelligence (that is, what IQ measures). Much of what we do is effectively managed through autonomous mental processes that involve neither. This is great, because it frees up the higher-order processes of cognition, which require conscious attention and greater energy, from being wasted on menial tasks. Walking and even driving are activities that are handled primarily by the autonomous mind. While the autonomous mind does a great job, we get into trouble when we let it handle situations that require higher-order cognition—intelligence and rationality rather than the automatic rules of thumb that the autonomous mind uses to make decisions. One of the important roles of the reflective mind is to interrupt autonomous processing when higher forms of thinking are required. To make better decisions, we need to value and develop the strengths of our reflective minds. Two important rational abilities, especially for data analysis, are the ability to reason logically and the ability to think in terms of probabilities. Unfortunately, relatively few people have been trained in these skills.
Stanovich explains how thinking works based on this tripartite model consisting of the autonomous mind, algorithmic mind, and reflective mind. He talks about the role, importance, strengths and weaknesses of each. He spends a lot of time describing the causes of errors in rational processing (what he calls dysrationalia) and how we can avoid them. And, thankfully, he gives us hope by showing that rational thinking, unlike most intelligence, can be learned. He’s on a mission to make this happen. If you believe in the importance of rationally informed decision making and agree that it’s lacking, I recommend that you read this compelling book.
A few days ago I noticed a blog post by Boris Evelson of Forrester Research titled “How to Differentiate Advanced Data Visualization Solutions.” Forrester is one of the leading IT research and advisory companies; along with its larger rival, Gartner, it serves as a trusted adviser to thousands of organizations, helping them make decisions about all aspects of information technology. Although it’s convenient for Chief Information Officers (CIOs) to subscribe to a single service for all the advice they need, is this approach reliable? It depends on whether we actually get advice from someone who has the expertise we’re missing. Far too often when relying on these services, however, we get advice from people whose range of topics is too broad to manage knowledgeably. We sometimes find ourselves being advised by someone who understands less about the topic than we do. If you’re looking for advice about data visualization products, based on what I read in Forrester’s blog, I suggest that you look elsewhere.
Evelson provided a list of features that he believes we should look for when shopping for an advanced data visualization solution. Unfortunately, his list looks as if it was constructed by visiting the websites of several vendors that claim to offer data visualization solutions and then collating the features that they offer. I expect more from a service to which people pay good money for advice. We can’t trust most vendors that sell data visualization software to tell us what we should expect from a good product. It is in their interest to promote the features that they offer, and only those features, whether they’re worthwhile or not. In fact, most vendors that offer so-called data visualization solutions know little about data visualization.
Another problem with Evelson’s advice is that it isn’t clear what he means by “advanced data visualization solutions.” What distinguishes advanced solutions from the others? Of the few features on his list that actually characterize an effective data visualization solution (most of his list misses the mark, as I’ll show in a moment), none go beyond the basic functionality that should exist in every data visualization solution, not just those that are “advanced.”
Evelson has offered the kind of analysis and advice that we get from people who dabble in data visualization, rather than those who have taken the time to develop, not just shallow talking points, but an understanding of what’s really needed and what really works.
Let’s take a look at each feature on Evelson’s list in the order presented and evaluate its worth.
Feature #1: “If it’s a thin client does it have Web2.0 RIA (Rich Internet Application) functionality (Flash, Flex, Silverlight, etc)?”
Response: This is a feature that only an IT guy with myopia could appreciate, not someone who actually analyzes and presents data. When evaluating software, we care about functionality and usability, not about the specific technology that delivers it. If we’re exploring and analyzing data via the Web, what matters is that interactions are smooth, efficient, easy, and seamless. How this is accomplished technically doesn’t matter.
Feature #2: “In addition to standard bar, column, line, pie charts, etc how many other chart types does the vendor offer? Some advanced examples include heat maps, bubble charts, funnel graphs, histograms, Pareto charts, spider / radar diagrams, and others?”
Response: So it’s the number of chart types that matters? What constitutes a chart type? Do useless chart types count? This is a lot like giving high marks to the software programs with the most lines of programming code, as if that were a measure of quality and usefulness. What matters is that a data visualization solution supports the types of charts that do what we need and that they work really well. Many data visualization products could be dramatically improved by removing many of the silly charts that they offer rather than by adding more to the collection.
Feature #3: “Can the data be visualized via gadgets/widgets like temperature gauges, clocks, meters, street lights, etc?”
Response: Is Evelson serious? Should vendors get points for providing silly, dysfunctional display gadgets? Most of the gauges, clocks, meters, and street lights that many so-called data visualization products provide are worthless. Anyone who understands data visualization knows this to be true. This is what Evelson looks for in “advanced” data visualization solutions?
Feature #4: “Can you mash up your data with geospatial data and perform analysis based on visualisation of maps, routes, architectural layouts, etc?”
Response: While the ability to view and interact with data geo-spatially is critical, most of the “mash-ups” that vendors enable are horribly designed, and thus of little use. Throwing quantitative data onto a Google map doesn’t qualify as effective data visualization. Google Maps (and other similar services) was not designed as a platform for quantitative display, but as a source for directions (“How do I get from here to there?”). Good geo-spatial data visualization uses maps that are designed to feature quantitative data within geo-spatial context that adds meaning to the data. What’s also important is that geo-spatial displays can be combined on the screen simultaneously with other forms of data visualization (for example, bar graphs, line graphs, tables, and so on) to provide a fuller view of the data than geography alone.
Feature #5: “Can you have multiple dynamically linked visualization panels? It’s close to impossible to analyze more than 3 dimensions (xyz) on a single panel. So when you need to analyze >3 dimensions you need multiple panels, each with 1-3 dimensions, all dynamically linked so that you can see how changing one affects another.”
Response: This is probably the clearest description on Evelson’s list of a feature that is actually useful and indeed critical. Whether the separate views of the data set appear in separate panels or not isn’t important, however. What’s important is the ability to visualize the data in multiple ways, that is, from multiple perspectives, on the screen at once. Only then can we construct a comprehensive view and spot relationships that would be impossible to see if we were forced to examine each view independently, one at a time.
Feature #6: “Animations. Clicking through 100s of time periods to perform time series analysis may be impractical. So can you animate/automate that time period journey / analysis?”
Response: So far, researchers have found only a limited role for animation in data visualization, especially for data analysis. When Hans Rosling of Gapminder uses bubble plots to tell a story, such as how the correlation between literacy and fertility throughout the world has changed through time, the moving bubbles (one per country) work because he is narrating, telling us where to look and what it means. Research has shown, however, that these same animated bubble plots are of limited use for data analysis. We simply cannot watch all those bubbles as they follow their independent trajectories through the plot. To compare the paths that two bubbles have taken through time, we must mark those paths with trails that provide static representations of the bubbles’ journeys. Too many software vendors are providing animations that are nothing more than cute tricks to entertain, rather than useful visualizations. We should run from any vendor that has actually taken the time to make the pointers on their silly gas gauges wobble back and forth for several seconds until they eventually stop moving and point to the value that we need.
Feature #7: “3 dimensional charts. Can you have a 3rd dimension, such as a size of a bubble on an XY axis?”
Response: Simply asking a vendor whether its products support 3-D displays is the wrong question. 3-D pie charts, bar graphs, and line graphs are almost never useful. Most implementations of 3-D in so-called data visualization products are either entirely gratuitous, and thus distracting, or far too difficult to read. The example that Evelson gave, however, the ability to add a third quantitative variable to a scatterplot by allowing the data points to vary in size, is actually useful, assuming the vendor designs this feature properly. That’s a big assumption.
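To illustrate the one worthwhile feature in this category, here’s a minimal sketch in Python with matplotlib (my own choice of tool, not anything Evelson or any vendor specifies) of a scatterplot that encodes a third quantitative variable as bubble size. The data are fabricated for illustration.

```python
# A scatterplot with a third quantitative variable encoded as bubble size.
# Assumes matplotlib is installed; all data values are made up.
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

ad_spend  = [10, 25, 40, 55, 70]       # x: thousands of dollars
revenue   = [120, 180, 260, 300, 410]  # y: thousands of dollars
headcount = [5, 12, 8, 20, 15]         # third variable, encoded as bubble size

# matplotlib's `s` parameter is area in points^2, so scaling it linearly
# makes bubble AREA proportional to the value, which is what readers judge.
sizes = [30 * h for h in headcount]

fig, ax = plt.subplots()
ax.scatter(ad_spend, revenue, s=sizes, alpha=0.5, edgecolor="black")
ax.set_xlabel("Ad spend ($K)")
ax.set_ylabel("Revenue ($K)")
fig.savefig("bubbles.png")
```

Note that the values are mapped to bubble area rather than radius; because readers judge bubbles by their area, scaling the radius instead would exaggerate the differences between values, which is one way vendors get this feature wrong.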
Feature #8: “Can you have microcharts (aka trellis) — a two dimensional chart embedded in each row or cell on a grid?”
Response: Evelson is onto something here, but he seems a bit confused about the terms. “MicroCharts” is the name of an Excel add-in product from BonaVista Systems. A microchart is a small chart, such as a sparkline or a bullet graph, that conveys rich information in a small amount of space, such as a single spreadsheet cell. A “trellis” display, which Edward Tufte has long called “small multiples,” is something quite different. It is a series of charts that breaks a data set into logical subsets, each displayed with the same quantitative scale and arranged within eye span on a single screen or page, for the purpose of making comparisons between the charts. For example, if a single scatterplot showing the correlation between the number of sales contacts and sales revenues for 500 customers across 20 separate products would be too cluttered and complex, we might solve the problem by creating a trellis display of 20 scatterplots, one per product.
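A trellis display of the kind just described can be sketched in a few lines. The following Python/matplotlib example (my own illustration with randomly generated data, not drawn from Evelson’s post or any product he reviewed) builds 20 scatterplots, one per product, on a shared quantitative scale:

```python
# A trellis ("small multiples") display: one scatterplot per product,
# all sharing the same scales so the panels can be compared at a glance.
# Assumes matplotlib is installed; the sales data are randomly generated.
import random
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

random.seed(42)
products = [f"Product {i + 1}" for i in range(20)]

# Fabricated (contacts, revenue) pairs for each product, illustration only.
data = {
    p: [(random.uniform(0, 50), random.uniform(0, 200)) for _ in range(25)]
    for p in products
}

# A 4 x 5 grid; sharex/sharey enforce the common quantitative scale
# that makes comparisons across panels meaningful.
fig, axes = plt.subplots(4, 5, figsize=(12, 8), sharex=True, sharey=True)
for ax, product in zip(axes.flat, products):
    xs, ys = zip(*data[product])
    ax.scatter(xs, ys, s=8, alpha=0.6)
    ax.set_title(product, fontsize=8)
axes[-1][0].set_xlabel("Sales contacts")
axes[-1][0].set_ylabel("Revenue ($K)")
fig.savefig("trellis.png")
```

The essential design choice is the shared scale: without `sharex`/`sharey` (or their equivalent in another tool), each panel would be scaled to its own data and cross-panel comparisons would mislead.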
Feature #9: “Can you do contextual or gestural (not instrumented, not pushing buttons, or clicking on tabs) manipulation of visualization objects, as in video games or iPhone like interface?”
Response: Evelson might be getting at something useful here, but he hasn’t distinguished the gratuitous, video game-like interactions that have become all too common in many so-called data visualization products from the useful interactions, supported by only a few products, that are needed to uncover the meanings that live in our data. For data exploration and analysis, it’s quite useful to interact with visualizations of data directly to change the nature of the display in pursuit of meaning, such as to sort or filter data. For instance, rather than using a separate control or dialog box to remove outliers in a scatterplot, it’s useful to be able to grab them with the mouse (or with your finger on a touch screen) and simply throw them away.
Feature #10: “Is the data that is being analyzed
a) Pulled on demand from source applications?
b) Stored in an intermediary DBMS
c) Stored in memory? This last one has a distinct advantage of being much more flexible. For example, you can instantaneously reuse an element as a fact or a dimension, or you can build aggregates or hierarchies on the fly.”
Response: What really matters is not where the information is stored, but how easily, flexibly, and rapidly we can access and interact with the data that we need. How this is accomplished technically needn’t concern us as long as it works.
Feature #11: “Is there a BAM-like operational monitoring functionality where data can be fed into the visualization in real time?”
Response: When real-time data updates are needed, this is a useful feature, but few data visualization solutions require real-time updates.
Feature #12: “In addition to historical analysis, does visualization incorporate predictive analytics components?”
Response: This is indeed useful, but what many vendors call “predictive analytics” is neither predictive nor analytical. Rather than simply asking vendors if they support predictive analytics (you will never get a “No” answer to this question), we should ask questions such as: “Can the software be used to build effective predictive models (that is, those that are statistically robust) that allow us not only to determine the probability of particular results under particular conditions, but also to see, understand, and therefore reason about the interactions between variables that contribute to those results?”
Feature #13: “Portal integration. If you have to deliver these visualizations via a portal (SharePoint, etc) do these tools have out of the box portal integration or do you need to customize.”
Response: Generic portal integration isn’t important. If you use a particular portal product and you need your analytics tools to integrate with it, then this specific requirement might be useful to you. This should not, however, be a reason to reject an otherwise effective data visualization solution. There are so few good solutions to choose from today that you shouldn’t let someone in your IT department turn away the one that’s useful to you because it doesn’t integrate neatly into your organization’s portal.
At the end of his list of features, Evelson asked, “What did I miss?” I appreciate his openness to suggestions. More than what he missed, however, I’m concerned about the features that he included that are either unimportant or that in some cases actually undermine data visualization.
Fundamentally, Evelson missed the opportunity to assess the effectiveness of data visualization solutions. Lists of features, even good ones, fail to do this. Another fundamental problem is that his list lumps all data visualization solutions together, as if every purpose for which data visualization might be used requires the same functionality. This is far from the truth. Uses of visualization for monitoring, analysis, or communication, although they share much in common, require many distinct features as well. When shopping for data visualization software, you must first know what you plan to accomplish with it and then determine the features that are specifically required for that purpose. Unless you’re planning to use a single tool for all purposes, you won’t need everything that a data visualization solution could possibly offer.
Evelson is but one of many people whom organizations erroneously trust for critical advice. Regarding data visualization, he lacks the expertise that’s required and legitimately expected. Anyone who sets himself up as an adviser, especially one whom organizations pay dearly, ought to develop deep expertise in the subject matter. Before we can shop effectively for technology, we must first shop effectively for reliable sources of advice.
In my work at Perceptual Edge, I help organizations analyze their performance and present what they find in tables and graphs, which often involves dashboards. People who attend my courses in dashboard design often ask for help in figuring out how to measure the performance of their organizations. Until you know what aspects of performance to measure and how to measure them, you can’t even begin to develop a dashboard. I don’t cover this important aspect of performance management in my work, but others do, including Stacey Barr, the Performance Measurement Specialist.
Organizations often use Key Performance Indicators (KPIs) that are vague, unactionable, or downright meaningless. So, how can you make sure your organization identifies and measures KPIs that are actually useful? Once you’ve identified the measures, how do you get people to support your measurement effort and how do you translate these measures to an organizational strategy? One good way is to attend Stacey Barr’s upcoming Performance Management Blueprint Workshop in Las Vegas on November 10 and 11. Stacey, who’s based in Australia, has been helping people around the world for the last ten years, and her Las Vegas workshop is a rare opportunity to see her in the United States. If your organization needs help with performance management, I highly recommend Stacey’s workshop.
I had the great pleasure last Thursday of hearing Malcolm Gladwell, journalist and author of the books Outliers, Blink, and The Tipping Point, speak at SAS Institute’s Innovators’ Summit in Chicago. I gave one of two keynote presentations in the morning and Gladwell gave the keynote in the afternoon. I believe that Gladwell is one of the great thinkers and communicators of our time, and his words on Thursday afternoon led me to believe this even more fervently.
Gladwell’s topic was well-chosen for a group of people who spend their time making sense of data (mostly statisticians) using SAS’ visual analysis product JMP. He spoke about problem solving and the fact that our problems today are different from those of the recent past. Our former problems were usually solved by digging up and revealing the right information. He used Watergate as an example, pointing out that key information was hidden, and the problem was solved when Washington Post journalists Woodward and Bernstein were finally able to uncover this information that had been concealed. Modern problems, on the other hand, are not the result of missing or hidden information, Gladwell argued, but the result, in a sense, of too much information and the complicated challenge of understanding it. Enron was his primary example. The information about Enron’s practices was not kept secret. In fact, it was published in several years’ worth of financial reports to the SEC, totaling millions of pages. The facts that led to Enron’s rapid implosion were there for anyone who was interested to see, freely available on the Internet. They weren’t understood, however, until a journalist spent two months reading and struggling to make sense of Enron’s earnings, eventually discovering that they existed only as contracts to buy energy at particular prices in the future, not as actual cash in the bank. The problems that we face today, both big ones in society like the current health care debate and smaller ones like strategic business decisions, do not exist because we lack information, but because we don’t understand it. They can be solved only by developing skills and tools to make sense of information that is often complex. In other words, the major obstacle to solving modern problems isn’t the lack of information, solved by acquiring it, but the lack of understanding, solved by analytics.
Gladwell’s insights were music to my ears, because he elegantly articulated something that I and a few others have been arguing for years, but he did so in a way that was better packaged conceptually. Several months ago I wrote in this blog about Richards J. Heuer’s wonderful book Psychology of Intelligence Analysis, and featured his assertion that we don’t need more data, we need the means to make sense of what we have. More directly related to my work in BI, I’ve stated countless times that this industry has done a wonderful job of giving us technologies for collecting and storing enormous quantities of data, but has largely failed to provide the data sense-making tools that are needed to put data to use for decision-making.
The title of my keynote at the Innovators’ Summit this year was “The Analytics Age.” I argued that the pieces have finally come together that are needed to cross the threshold from the Information Age, which has produced great mounds of mostly unused information, to the Analytics Age, when we’ll finally learn how to understand it and use it to make better informed, evidence-based decisions. The pieces that have come together include:
- plenty of information
- proven analytical methods (especially statistics enhanced through visualization)
- effective analytical tools (only a few good ones so far, but this will change)
- a growing awareness in society that analytics are needed to replace failed decision-making methods, based on whim and bias, that have led us to so much trouble
Although many software vendors claim to sell analytics tools, seeking to exploit the growing awareness that analytics are powerful and necessary, few actually understand this domain. Their products demonstrate this fact, offering little more than silly imitations of analytical techniques. This is true of every traditional BI software vendor in the market today. As Gladwell pointed out, the paradigm has shifted; the skills and methods that worked in the past can’t solve the problems of today. Only a few software vendors that play in the BI space (none of which represent traditional BI) have the perspective and knowledge required to build tools that can help us solve modern problems. Most of these have either evolved from a long-term focus on statistics, such as SAS Institute, or have emerged as spin-offs of academic research in information visualization, such as Tableau and Spotfire. If traditional BI vendors want to support the dawning analytics age, they must retool. They must switch from an engineering-centric worldview, focused primarily on technology, to a design-centric perspective, focused primarily on the human beings who actually work with data. Only then will they be able to build effective analytical tools that take advantage of human visual and cognitive strengths and augment human weaknesses.
Borrowing another insight from Gladwell, I believe we are approaching the “tipping point” when people will no longer be fooled by analytical imitations and will begin to develop the skills and demand the tools that are needed to get the job done. Business intelligence vendors that fail to catch on or to turn their unwieldy ships in time will be left behind. The times are changing and so must they.
If you’re among the minority in the workforce today who understand analytics and are willing to tie your talents to good tools that utilize them fully, you are in for the ride of your life. As Hal Varian, University of California, Berkeley professor and current Chief Economist at Google, recently stated in an interview, “statistician” will become the sexy job of the coming years, just as software engineers enjoyed that position for years, beginning in the 1980s. Evidence of this can already be discerned. Even in today’s depressed job market, graduates with degrees in statistics are in extremely high demand and are being rewarded with high salaries. You don’t need a Ph.D. in statistics to be a good data analyst, of course. You must, however, have the soul of an investigator and a flexible, analytical mind. You must be able to think critically. Daniel Pink made this case brilliantly in his book A Whole New Mind (2005). What I’m calling the “analytics age,” he called the “conceptual age.”
Our schools are not geared up to produce this kind of workforce, so if you’ve somehow managed to develop these skills, there’s a place of honor for you in the world that’s emerging. You’ll be appreciated in ways that were rare during those years when I worked in the corporate world. Won’t it be refreshing to actually be thanked when you reveal, through painstaking analysis, faults in your organization’s policies, practices, or assumptions, rather than being ignored or punished? Won’t it be nice to be rewarded when you save your organization millions of dollars by warning against a doomed decision rather than being demoted for speaking a politically unpopular truth? Won’t it feel good to prepare a well-reasoned case for a specific course of action and not have your hard work discarded in the blink of an eye by a manager who says “No, we’ll do it my way, because I’m the boss.” If your heart sings at these prospects, hold your head up and stay true; your day is coming.
Am I dreaming? Can a society and a workplace in which reason and evidence trump whim and bias really emerge with enough strength to shift the balance? I hope so, but there’s no guarantee. I’m going to do everything I can to help usher it in. The opportunity is now. I don’t want to live in the sad, dumb, unjust society that is our future if this opportunity is missed.
Not long ago, BusinessWeek published a story titled “Data Visualization: Stories for the Information Age” by Maria Popova (self-described as a “digital anthropologist, cultural curator and semi-secret geek aggregating the world’s eclectic interestingness”). The article featured the work of Aaron Koblin of Google’s Creative Labs (self-described as an “artist/designer/programmer”). Popova and Koblin bring fresh perspectives to data visualization, but they are newcomers to the field and have both made statements about it that demonstrate their lack of experience.
Popova describes the field as follows:
Data visualization has nothing to do with pie charts and bar graphs. And it’s only marginally related to “infographics,” information design that tends to be about objectivity and clarification. Such representations simply offer another iteration of the data, restating it visually and making it easier to digest. Data visualization, on the other hand, is an interpretation, a different way to look at and think about data that often exposes complex patterns or correlations.
“Has nothing to do with pie charts and bar graphs”? I would gladly support any effort to dismiss pie charts (with a few exceptions), but the notion that bar graphs and other traditional displays of quantitative data have nothing to do with data visualization is just plain silly. No one who understands data visualization and has done any work in the field would make such a statement, nor would they go on to say that unlike quantitative graphs, “data visualization…is an interpretation, a different way to look at and think about data that often exposes complex patterns or correlations.” In truth, all visual representations of data are interpretations. The very act of selecting information and presenting it in a particular way is an act of interpretation. All forms of data visualization, whether traditional bar graphs or some of the newer animated displays, should be embraced if they bring data alive in clear, simple, and accurate ways to help us understand the stories that live therein. Let’s not be tempted to dismiss the tried and true in our excitement over the novel, or over what merely gives the impression of being novel.
Moving on to Koblin, much of data visualization that is novel technologically works on the same principles as bar graphs, scatterplots, and line graphs. For example, Koblin created an animated display of SMS messages throughout a single day in Amsterdam. The degree to which it provides insight into this information relies on the same basic mechanism that causes bar graphs to work: using objects of varying heights to display differences in quantitative values.
As an artist and technologist, Koblin has done some interesting work. I especially like his animated visualization of worldwide air traffic throughout the course of a day. It tells the high-level story of daily air traffic and works as an effective starting point for further exploration and analysis, which would then require more conventional visualizations. At Google, Koblin has an enviable opportunity to play with data. He has produced several visualizations that are fun and useful for artistic purposes, but fewer that actually present information in meaningful and useful ways. Of the three terms that he uses to describe himself (artist/designer/programmer), self-expression as an artist and programmer appears to come through more frequently and strongly in Koblin’s work than the solutions of a designer. I certainly can’t fault him for the wonderful examples of self-expression that he’s created, but I feel that I must take issue with something he’s said about data visualization: “It’s not about clarifying data…it’s about contextualizing it.” Actually, it’s about both. Without clarity, which is sometimes lacking in Koblin’s visualizations, context can only take us so far.
In spite of its problems, I love the way that Popova concludes her article, with one minor exception:
Ultimately, data visualization is more than complex software or the prettying up of spreadsheets. It’s not innovation for the sake of innovation. It’s about the most ancient of social rituals: storytelling. It’s about telling the story locked in the data differently, more engagingly, in a way that draws us in, makes our eyes open a little wider and our jaw drop ever so slightly. And as we process it, it can sometimes change our perspective altogether.
I balk only at her statement that “it’s about telling the story locked in the data differently.” Differently than other forms of visual display that have worked in the past, like bar graphs, perhaps? To stress that the story must be told differently is to shift our assessment of what works and what’s needed to “innovation for the sake of innovation” to a degree that Popova herself warns against. As artists, programmers, and people from other disciplines and perspectives venture into the realm of data visualization, they will help us expand our horizons. Some will also be too quick to dismiss what they don’t understand as they approach data visualization, in Popova’s words, as “a new frontier of self-expression.” Those of us who have worked in the field for a while must be open to new ideas; those who are newcomers must respect the work of those who have worked hard to make data visualization useful and effective. I welcome this collaboration.
People sometimes adopt existing terms and give them new meanings to suit their interests, leading to confusion: sometimes usefully, when things need to be shaken up, and sometimes not. Perhaps Koblin and others like him have done this with the term “data visualization” and are using it to describe something related to but in many ways different from data visualization as I know and practice it. I suspect that another term, perhaps “information art,” would better describe what Koblin and others who combine their artistic and technological interests are introducing. Manuel Lima, of the site Visual Complexity, recently made this point in a blog post titled “Information Visualization Manifesto.” Lima proceeded from there to thoughtfully and eloquently describe the characteristics that define data visualization (or more specifically “information visualization”) as distinct from other types of information display. His words are brilliant and timely. I heartily recommend that you read them.
I recently took a look at a new book from O’Reilly Media that’s a thoughtful introduction to the general concepts of data analysis. Unlike most books on data analysis, this is not software-specific and it does not focus on some complicated aspect of statistical or financial analysis. The book is Head First Data Analysis by Michael Milton. In its Head First series, O’Reilly is trying to provide books that introduce computer-related topics in a way that speaks to the way our brains learn. These books teach the material using real-world scenarios with lots of activities to engage us in thinking, rather than merely throwing a bunch of information at us and hoping that some will stick. Head First Data Analysis fills a useful gap at just the right time. Now that organizations are beginning to take the need for effective data analysis more seriously and people of all types are becoming responsible for the task, this book presents the basic concepts in a way that is practical and accessible to all.
Milton doesn’t go deeply into any of the concepts, but that’s by design. If you’re comfortable with statistics, this is not the book for you, but if you’re one of the many people who must make sense of quantitative information as part of your job and you’ve never been trained in data analysis, this book will set your feet on the path. Milton does a good job of identifying the basics that people need to get started and explaining them in simple ways that make them immediately useful. It’s a workbook of sorts, with exercises throughout. If you really want to learn, this is the way to do it.
I’m occasionally asked by journalists to describe actual cases when organizations have derived real tangible benefits from data visualization. When asked, I’m usually forced to answer in terms of generalities, for the following reasons:
- The nature of my work with clients (training and design consulting services) rarely gives me a chance to see the results of my work.
- On those occasions when I am able to see work that a client produced as a result of my services, I am rarely allowed to share it publicly.
- People often contact me to say how much they appreciate my work, especially my books and articles, but they rarely share specific, tangible benefits.
- When people have shared specific accounts of benefits, I’ve remembered only the gist of those accounts, almost never the details.
Even though we have plenty of evidence from years of research to support the tremendous potential of data visualization, we are lacking in specific accounts that confirm beneficial outcomes in the real world, either empirically in the form of measured results or anecdotally.
I was reminded of this frustrating blind spot today when I received yet another interview request from a journalist and imagined myself explaining once again that, although there is a great deal of evidence that data visualization works based on perceptual studies, etc., I have little documented evidence that it works in practice.
I need real stories from you who use data visualization to analyze and present data. Has data visualization led you to important findings? Has data visualization helped your organization increase revenues or decrease costs? Has data visualization increased efficiency or productivity? Have good decisions been made because information was presented visually?
Help me out here. Tell me about your experiences. Be my eyes where I cannot go myself to observe.
In many respects, the “information age” is anything but. An overwhelming supply of data, powered by advances in technology that ignore the needs and abilities of humans, can do more harm than good. Because of what we’re learning through brain research, which has made great strides in the last decade, we now have an opportunity to do much better. Perhaps no one does a better job of explaining in broadly accessible terms what we now know about the human brain and how it works than developmental molecular biologist John Medina. What’s special about Medina’s work is that he isn’t just delivering the facts; he’s applying them in practical ways to improve our lives.
In Brain Rules: 12 Principles for Surviving and Thriving at Work, Home, and School, Medina takes us on a fascinating journey through the brain, expressing what research has revealed in the form of simple rules that we can follow to live smarter and better, and help others do the same. In the book’s introduction, Medina writes:
Most of us have no idea how our brain works.
This has strange consequences. We try to talk on our cell phones and drive at the same time, even though it is literally impossible for our brains to multitask when it comes to paying attention. We have created high-stress office environments, even though a stressed brain is significantly less productive. Our schools are designed so that most real learning has to occur at home. This would be funny if it weren’t so harmful. Blame it on the fact that brain scientists rarely have a conversation with teachers and business professionals, education majors and accountants, superintendents and CEOs. Unless you have the Journal of Neuroscience sitting on your coffee table, you’re out of the loop.
This book is meant to get you into the loop.
Our classrooms, workplaces, and homes are in many ways designed to thwart brain health, effective learning, personal fulfillment, and overall progress as a species. In many ways, the information technologies that dominate our lives today have contributed to this sad state of affairs, not because of anything inherently wrong with technology, but because most of it was developed without understanding how our brains work. As such, reliance on technology can actually make us unhappy and dumb. This is true of most business intelligence (BI) technology, which is my professional domain. All business intelligence professionals, especially those who develop BI tools, should read this book. You’ll enjoy the process, learn how to live a happier, smarter, and more productive life, and develop an understanding of the brain that will help you more effectively support the goals of business intelligence.
P.S. The website www.brainrules.net provides a great deal of information, including videos, which you can use to preview the book. Also, Garr Reynolds, the author of Presentation Zen, put together a wonderful slideshow that you can view to get the gist of the book, especially as it applies to presenters.
In March of 2006 I glimpsed the new charting capabilities of Excel 2007 for the first time and wrote about them in an article titled “Excel’s New Charting Engine: Preview of an Opportunity Missed.” After waiting for years to see how the world’s most popular data analysis software would improve its sadly lacking charting capabilities, I mourned the opportunity for improvement that was almost entirely missed. Essentially, an entirely new charting engine in Excel 2007 replaced the old one, but what it brought with it was a fresh array of flashy visual effects that encouraged us to hide our data behind a thick layer of cheap makeup. Within two days of my article’s publication, I received an email from Scott Ruble, the person in charge of charting functionality in Microsoft Office products. Scott invited me to help the team improve the charting capabilities of the product’s next major release, Excel 2010, which will become available sometime during the first half of next year. We’ve had several conversations since, including a teleconference with the team. Early glimpses into the charting capabilities of Excel 2010 are now beginning to surface, and it appears that the opportunity to improve the product’s data visualization capabilities has once again been missed. Although I haven’t seen an advance version of the product myself, those who have tell me that it includes only one change to charting: the addition of sparklines. What a shame.
Don’t misunderstand me. I’m thrilled that a version of Tufte’s sparklines will be added. Assuming that the implementation is well designed, this will eliminate the need for an add-in product if you want to display a set of time-series values as a simple sparkline, but this is a single grain of sand compared to an entire seashore of need. No single product in the world is used more than Excel for analytics, not because it’s a good tool for data exploration, analysis, and presentation (it isn’t), but because almost everyone in the world who works with quantitative data has it. Just imagine how much the world would benefit if Excel were more powerful and better designed. I was frustrated and upset when Excel 2007 missed the mark, but now with Excel 2010 trying to assuage our misery with nothing but sparklines, I’m inclined to give up on the product entirely as a tool for data analysis. Fortunately, where Excel has failed, alternative products have emerged that deliver effective and visionary analytical abundance.
Will Microsoft play a role in the future of data analytics? Although the company boasts a business intelligence (BI) solution and even declared its commitment with the first annual Microsoft Business Intelligence Conference in May of 2007 (the 2009 conference was cancelled), only its database does anything particularly useful for BI so far. The other pieces that have been awkwardly rubber-banded together into a so-called BI solution suggest the lack of a strategy, or a confused one at best, and previews of coming additions, such as Project Gemini, suggest nothing but already dated functionality for the future. I don’t have a bias against Microsoft because it’s huge and powerful; I have a progressively growing disappointment with it because of unfulfilled potential. If Microsoft seriously applied itself to the task, it could probably do something wonderful for the world of analytics. At this point, Microsoft will have to do something big, totally unexpected, and uncharacteristically well designed if it hopes to play a role in the future of analytics. I would welcome this with arms wide open, but I’m not holding my breath.