Wednesday, May 14, 2008

information visualization for the people

Nathan from FlowingData just blogged about a post I missed a couple weeks back by Todd at A Beautiful WWW concerning one of my favorite questions: Why isn't data visualization more popular?

Serendipitously, this was the central topic of my Comparative Media Studies Master's thesis, Information Visualization for the People, which I am happy to say was submitted and accepted this past Friday. Getting it written was also the reason for the utter lack of posts here for the last few months--hopefully its completion signifies an end to that dry spell. Although, in addition to figuring out life beyond grad school, I do have some serious Grand Theft Auto related business to attend to...

In any case, for those who are interested in checking it out, the thesis will be available permanently online at the CMS thesis website (along with the work of the other fine folks at CMS, covering a wide range of topics) once they get around to posting it. In the meantime, I've temporarily posted a PDF copy here, and I intend to "webify" it in a more mutable form at some point in the relatively near future. I welcome any and all criticism, as it represents a fairly specific take on the state of information visualization today, but draws on a lot of existing infovis research (including some that I've blogged about here in the past). I by no means consider it a definitive document, and parts of it are certainly weak, but I want to make it available in the hopes that it contributes to the popular discourse we are starting to see around these issues.

Here's the abstract:

The design of information visualization, defined as the interactive, graphical presentation of data, is on the verge of a significant paradigm shift brought on by the continued maturation of the Information Age. Its traditional role as a scientific tool deployed by rigorous data analysts is in the process of expanding to include more mainstream uses and users, reflecting fundamental changes to the role of information and data in our increasingly digital society. However, visualization design theory remains rooted in earlier conceptions of its use, largely ignoring the needs of this new, non-expert audience. Accordingly, this thesis attempts to re-contextualize information visualization as a public-facing practice, and explores ways in which its design can shift from being described as “by experts, for experts” to a new characterization as “for the people.”

Many thanks to my thesis committee, Nick Montfort, Fernanda Viegas, and Martin Wattenberg, for giving me great advice and not laughing me out of the room after reading it!

Monday, February 25, 2008

more progressive visualizations from the NYT

I was just alerted to this appropriately-themed visualization posted on the New York Times website a couple of days ago. Once again, the NYT has taken a very progressive approach to their designs; this one is a variation of a stacked graph (popularized on the internet by Martin Wattenberg's Baby Name Voyager, though this implementation looks very similar to Lee Byron's visualizations of listening habits) showing the box office earnings of many of the major Hollywood movies released in the last 20 years. There are lots of interesting trends to discover in here, as the stacked graph is pretty good at illustrating "outliers" in the data. Even better, it is a very striking design that encourages you to explore it.

The NYT is clearly pushing at the boundaries of traditional newspaper infographics, which I think is fantastic. As usual, I find myself wondering what kind of response these visualizations generate from NYT readers. Do they find them more compelling? More readable? Or the opposite? This type of stacked graph, for instance, is a perfect example of a visual construction that flirts with semantic ambiguity just enough to potentially become confusing; I have recently been arguing in my thesis that the ways in which we attempt to interpret more elaborate examples of infovis today are based on the basic graph-reading skills we were taught in grade school (which, unfortunately, is about the extent of the visualization education most of us receive). We implicitly look for what Jacques Bertin described as "the x y z construction;" the relationship of visual marks to the principal axes of the image plane. In other words, "the independent variable goes on the X axis, and dependent variable goes on the Y axis." Or, as Matt Ericson recently suggested, "time goes on the X axis, and anything else goes on the Y axis" (my quotation).

Some constructions, such as this type of stacked graph, cannot be understood quite so simply. While overall it is a fairly simple graphic, there are some significant ways in which its "marks" are ambiguously signified. For instance, in this implementation there is no literal X axis (i.e., the horizontal line with the "x" next to it). If you were to try to understand it as a basic 2D plot, you might wonder where the "zero" is on the Y axis. Accordingly, do the curves that point downwards represent negative values? And what is the significance of the Y-ordering of the data? Of course, the first two (theoretical) questions are inappropriate for this type of plot, and the answer to the third is that the significance is ambiguous or undefined. It is simply the nature of this type of "layout algorithm." But if you were trying to read this graphic in terms of the basic constructions we learned as kids, those questions might rightly stump you and turn you off of the whole thing. This is something I think we need to be aware of when designing infovis for a broad audience.
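For the curious, here is a rough sketch of why those first two questions don't apply. I don't know what layout the NYT piece actually uses, so this assumes one common choice: a baseline centered around zero, with each layer stacked on top of the previous one. The function name, the made-up film numbers, and the `centered` flag are all mine, purely for illustration.

```python
def stack_layers(series, centered=True):
    """Return a (lower, upper) pair of boundary lists for each layer.

    series:   list of equal-length lists of non-negative values, one per layer.
    centered: if True, start the stack at -total/2 for each time step
              (an assumed symmetric baseline) instead of at zero.
    """
    n_steps = len(series[0])
    totals = [sum(layer[t] for layer in series) for t in range(n_steps)]
    current = [-total / 2 if centered else 0.0 for total in totals]
    layers = []
    for layer in series:
        # Each layer sits directly on top of whatever was drawn before it.
        upper = [lo + value for lo, value in zip(current, layer)]
        layers.append((current, upper))
        current = upper
    return layers


# Hypothetical weekly grosses for three made-up films (arbitrary units).
films = [
    [1, 4, 2, 0],
    [0, 2, 6, 2],
    [3, 1, 1, 5],
]

for lower, upper in stack_layers(films):
    thickness = [u - l for l, u in zip(lower, upper)]
    print("lower edge:", lower, "thickness:", thickness)
```

With a centered baseline, the lower edge of the bottom layer dips below zero even though every data value is non-negative. The only quantity that carries the data is the thickness of each band, and reordering the layers changes the shapes without changing any thickness, which is why the Y-ordering has no fixed meaning.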

Obviously, that is by no means a criticism of this NYT piece. Exactly the opposite: it won't be until visualization education is advanced beyond what was probably taught 100 years ago (that'll be the day), or more modern examples of infovis proliferate in the "public consciousness," that infovis will be seen as a viable mainstream medium. The NYT is certainly pushing towards the latter goal! Big ups to that!

More discussion on Nathan's blog, FlowingData.

PS. Does anyone who reads the NYT website more regularly than I do know how to get these visualizations as an RSS feed? I have subscribed to their "multimedia" feed, but all I get are photographs. I'm missing quality stuff here!

Monday, February 11, 2008



Via infosthetics this morning I discovered what looks like a very promising new news aggregator/search site called Silobreaker.

These guys are basically mining a very hefty set of news outlets (supposedly over 10,000 (!!!) from around the globe) to build relational models between events, people, etc., and give context to any particular news item (hence the name, I guess). In other words, “semantic web” type stuff. Of particular interest is their use of various visualization tools, front and center, to organize all this information. There is a “hotspot” map showing the locations of popular stories, a network graph indicating relationships between people and places mentioned in any particular news item, and a bunch of trend graphs measuring various interesting metrics.

Obviously, what I find most compelling about the site is that it presents a perfect application of information visualization in a context (i.e., reading the news) that is relevant to almost everyone who surfs the web. In this case, the particular types of visualizations being used really seem critical to exploration of the site's content, rather than functioning as a side-bar novelty item. And, from my admittedly quick investigation, they appear to work well as exploratory tools -- I'm curious to see how users respond to them. From the design perspective, I think they are, for the most part, simple, effective, and informative; very reminiscent of the graphics on Many Eyes. The filtering interface for the network graph could be more user friendly, and I can't find a "help" button anywhere that explains how to read a network graph (which I would argue is pretty important, especially given their potential complexity and that their layout can change depending on filtering options), but it offers a nice overview of relationships with more info to "drill down" into via mouse-over. One element I particularly like (maybe because I'm writing about it in my thesis right now!) is the use of icons to identify the categories of nodes in the graph. With a quick glance you can easily digest the relationships between people, organizations, cities, etc. It would be really nice, although I don't know if it is "semantically" possible, if there were a way to similarly identify categories in the connections between nodes, to suggest the nature of the relationship without having to drill down and try to extract that from the detailed data.

Visualizations aside, what I find promising about this site from a more "high-concept" perspective is what they describe on their "About" page: the notion of presenting specific news stories within a larger contextual web. News media bias and the "echo-chamber effect" have been hot issues the last few years, and the sort of approach Silobreaker is attempting is, at least on the surface, a great way to deal with it. Particularly when using the visualizations, what you get from a story that you read on Silobreaker is potentially more than just the perspective of the media outlet that contributed the article. This is something that strikes me as tremendously important; as we become more deluged with information, we run the risk of walling ourselves into "silos" without a clear understanding of the big picture (for a related take on this issue also involving visualization, see Ethan Zuckerman's Global Attention Profiles project at the Berkman Center for Internet and Society at Harvard Law School from a few years back).

This is definitely a visualization site to keep an eye on. Are any readers already using Silobreaker? What do you think about it? Will it prove useful and compelling, or will it go the way of CNET's similar attempt at incorporating a semantic network graph into their news reporting?

PS. Please excuse my lack of posts lately on account of my being deep in the thesis writing process!

Monday, January 14, 2008

more on NYT graphics

After a long holiday absence, it's time to start posting regularly again!

Continuing with the New York Times theme from my last post (almost a month ago!), I just caught this podcast at the User Interface Engineering website, called Making Data Engaging: A Talk with the New York Times Interactive Design Team. As the title suggests, it's an interview with Andrew DeVigal and Steve Duenes, who are part of the NYT's design team (along with, among others, Matt Ericson, who's been mentioned here before) that develops the interactive graphics that appear on the NYT website.

Though there isn't a huge amount of content concerning visualization design explicitly, they talk a lot about the development of some recent projects, including the nice NYTimes Debate Analyzer. If you don't want to listen to the whole thing, there is some interesting commentary towards the end (starting at about 16:16) on their influences, which include 1940s magazine infographics, current advertising campaigns, and iTunes. Also, incidentally, I found the exchange over their Trailer Living, Then and Now visualization (20:10) pretty interesting as an example of how what I would consider "vernacular visualization" (and they perhaps consider "cheesiness") can make for compelling designs.