Wednesday, May 14, 2008

information visualization for the people

Nathan from FlowingData just blogged about a post I missed a couple weeks back by Todd at A Beautiful WWW concerning one of my favorite questions: Why isn't data visualization more popular?

Serendipitously, this was the central topic of my Comparative Media Studies Master's thesis, Information Visualization for the People, which I am happy to say was submitted and accepted this past Friday. Getting it written was also the reason for the utter lack of posts here for the last few months--hopefully its completion signifies an end to that dry spell. Although, in addition to figuring out life beyond grad school, I do have some serious Grand Theft Auto related business to attend to...

In any case, for those who are interested in checking it out, the thesis will be available permanently online at the CMS thesis website (along with the work of the other fine folks at CMS, covering a wide range of topics) once they get around to posting it. In the meantime, I've temporarily posted a PDF copy here, and I intend to "webify" it in a more mutable form at some point in the relatively near future. I welcome any and all criticism, as it represents a fairly specific take on the state of information visualization today, but draws on a lot of existing infovis research (including some that I've blogged about here in the past). I by no means consider it a definitive document, and parts of it are certainly weak, but I want to make it available in the hopes that it contributes to the popular discourse we are starting to see around these issues.

Here's the abstract:

The design of information visualization, defined as the interactive, graphical presentation of data, is on the verge of a significant paradigm shift brought on by the continued maturation of the Information Age. Its traditional role as a scientific tool deployed by rigorous data analysts is in the process of expanding to include more mainstream uses and users, reflecting fundamental changes to the role of information and data in our increasingly digital society. However, visualization design theory remains rooted in earlier conceptions of its use, largely ignoring the needs of this new, non-expert audience. Accordingly, this thesis attempts to re-contextualize information visualization as a public-facing practice, and explores ways in which its design can shift from being described as “by experts, for experts” to a new characterization as “for the people.”

Many thanks to my thesis committee, Nick Montfort, Fernanda Viegas, and Martin Wattenberg, for giving me great advice and not laughing me out of the room after reading it!

Monday, February 25, 2008

more progressive visualizations from the NYT

I was just alerted to this appropriately-themed visualization posted on the New York Times website a couple of days ago. Once again, the NYT has taken a very progressive approach to their designs; this one is a variation of a stacked graph (popularized on the internet by Martin Wattenberg's Baby Name Voyager, though this implementation looks very similar to Lee Byron's visualizations of listening habits) showing the box office earnings of many of the major Hollywood movies released in the last 20 years. There are lots of interesting trends to discover in here, as the stacked graph is pretty good at illustrating "outliers" in the data. Even better, it is a very striking design that encourages you to explore it.

The NYT is clearly pushing at the boundaries of traditional newspaper infographics, which I think is fantastic. As usual, I find myself wondering what kind of response these visualizations generate from NYT readers. Do they find them more compelling? More readable? Or the opposite? This type of stacked graph, for instance, is a perfect example of a visual construction that flirts with semantic ambiguity just enough to potentially become confusing; I have recently been arguing in my thesis that the ways in which we attempt to interpret more elaborate examples of infovis today are based on the basic graph-reading skills we were taught in grade school (which, unfortunately, is about the extent of the visualization education most of us receive). We implicitly look for what Jacques Bertin described as "the x y z construction;" the relationship of visual marks to the principal axes of the image plane. In other words, "the independent variable goes on the X axis, and dependent variable goes on the Y axis." Or, as Matt Ericson recently suggested, "time goes on the X axis, and anything else goes on the Y axis" (my quotation).

Some constructions, such as this type of stacked graph, cannot be understood quite so simply. While overall it is a fairly simple graphic, there are some significant ways in which its "marks" are ambiguously signified. For instance, in this implementation there is no literal X axis (i.e. the horizontal line with the "x" next to it). If you were to try to understand it as a basic 2D plot, you might wonder where the "zero" is on the Y axis. Accordingly, do the curves that point downwards represent negative values? And what is the significance of the Y-ordering of the data? Of course, the first two (theoretical) questions are inappropriate for this type of plot, and the answer to the third is that the significance is ambiguous or undefined. It is simply the nature of this type of "layout algorithm." But if you were trying to read this graphic in terms of the basic constructions we learned as kids, those questions might rightly stump you and turn you off of the whole thing. This is something I think we need to be aware of when designing infovis for a broad audience.
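For the curious, here is a rough sketch of why the "where is zero?" question doesn't apply. I don't know what layout the NYT graphic actually uses, so this assumes the simplest centered variant, where the whole stack is balanced around a horizontal midline; the function name and the box office numbers are made up for illustration. The point is that only the thickness of each band encodes a value, while its vertical position is a by-product of the layout.

```python
def symmetric_stack(layers):
    """Stack layers around a centered baseline (an assumed, simplified layout).

    layers: list of equal-length lists of non-negative values,
            one list per series, one value per time step.
    Returns a list of (lower, upper) boundary lists, bottom layer first.
    """
    n_steps = len(layers[0])
    # Total height of the stack at each time step.
    totals = [sum(layer[t] for layer in layers) for t in range(n_steps)]
    # Center the stack: the bottom edge sits at minus half the total.
    lower = [-total / 2 for total in totals]
    bands = []
    for layer in layers:
        # Each band starts where the previous one ended.
        upper = [lo + value for lo, value in zip(lower, layer)]
        bands.append((lower, upper))
        lower = upper
    return bands


# Hypothetical weekly grosses for three films (made-up numbers).
films = [
    [4, 2, 1, 0],
    [0, 6, 3, 1],
    [0, 0, 8, 4],
]
bands = symmetric_stack(films)
for lower, upper in bands:
    thickness = [u - lo for lo, u in zip(lower, upper)]
    print("lower:", lower, "upper:", upper, "thickness:", thickness)
```

Every input value here is zero or positive, yet the lower edges dip below zero. A curve that points downwards just means the stack as a whole got taller at that moment, and reordering the films would move the bands around without changing any of their thicknesses.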

Obviously, that is by no means a criticism of this NYT piece. Exactly the opposite: it won't be until visualization education is advanced beyond what was probably taught 100 years ago (that'll be the day), or more modern examples of infovis proliferate in the "public consciousness," that infovis will be seen as a viable mainstream medium. The NYT is certainly pushing towards the latter goal! Big ups to that!

More discussion on Nathan's blog, FlowingData.

PS. Does anyone who reads the NYT website more regularly than I do know how to get these visualizations as an RSS feed? I have subscribed to their "multimedia" feed, but all I get are photographs. I'm missing quality stuff here!

Monday, February 11, 2008



Via infosthetics this morning I discovered what looks like a very promising new news aggregator/search site called Silobreaker.

These guys are basically mining a very hefty set of news outlets (supposedly over 10,000 (!!!) from around the globe) to build relational models between events, people, etc., and give context to any particular news item (hence the name, I guess). In other words, “semantic web” type stuff. Of particular interest is their use of various visualization tools, front and center, to organize all this information. There is a “hotspot” map showing the locations of popular stories, a network graph indicating relationships between people and places mentioned in any particular news item, and a bunch of trend graphs measuring various interesting metrics.

Obviously, what I find most compelling about the site is that it presents a perfect application of information visualization in a context (i.e. reading the news) that is relevant to almost everyone who surfs the web. In this case, the particular types of visualizations being used really seem critical to exploration of the site's content, rather than functioning as a side-bar novelty item. And, from my admittedly quick investigation, they appear to work well as exploratory tools -- I'm curious to see how users respond to them. From the design perspective, I think they are, for the most part, simple, effective, and informative; very reminiscent of the graphics on Many Eyes. The filtering interface for the network graph could be more user friendly, and I can't find a "help" button anywhere that explains how to read a network graph (which I would argue is pretty important, especially given their potential complexity and that their layout can change depending on filtering options), but it offers a nice overview of relationships with more info to "drill down" into via mouse-over. One element I particularly like (maybe because I'm writing about it in my thesis right now!) is the use of icons to identify the categories of nodes in the graph. With a quick glance you can easily digest the relationships between people, organizations, cities, etc. It would be really nice, although I don't know if it is "semantically" possible, if there were a way to similarly identify categories in the connections between nodes, to suggest the nature of the relationship without having to drill down and try to extract that from the detailed data.

Visualizations aside, what I find promising about this site from a more "high-concept" perspective is what they describe on their "About" page: the notion of presenting specific news stories within a larger contextual web. News media bias and the "echo-chamber effect" have been hot issues the last few years, and the sort of approach Silobreaker is attempting is, at least on the surface, a great way to deal with them. Particularly when using the visualizations, what you get from a story that you read on Silobreaker is potentially more than just the perspective of the media outlet that contributed the article. This is something that strikes me as tremendously important; as we become more deluged with information, we run the risk of walling ourselves into "silos" without a clear understanding of the big picture (for a related take on this issue also involving visualization, see Ethan Zuckerman's Global Attention Profiles project at the Berkman Center for Internet and Society at Harvard Law School from a few years back).

This is definitely a visualization site to keep an eye on. Are any readers already using Silobreaker? What do you think about it? Will it prove useful and compelling, or will it go the way of CNET's similar attempt at incorporating a semantic network graph into their news reporting?

PS. Please excuse my lack of posts lately on account of my being deep in the thesis writing process!

Monday, January 14, 2008

more on NYT graphics

After a long holiday absence, it's time to start posting regularly again!

Continuing with the New York Times theme from my last post (almost a month ago!), I just caught this podcast at the User Interface Engineering website, called Making Data Engaging: A Talk with the New York Times Interactive Design Team. As the title suggests, it's an interview with Andrew DeVigal and Steve Duenes, who are part of the NYT's design team (along with, among others, Matt Ericson, who's been mentioned here before) that develops the interactive graphics that appear on the NYT website.

Though there isn't a huge amount of content concerning visualization design explicitly, they talk a lot about the development of some recent projects, including the nice NYTimes Debate Analyzer. If you don't want to listen to the whole thing, there is some interesting commentary towards the end (starting at about 16:16) on their influences, which include 1940s magazine infographics, current advertising campaigns, and iTunes. Also, incidentally, I found the exchange over their Trailer Living, Then and Now visualization (20:10) pretty interesting as an example of how what I would consider "vernacular visualization" (and they perhaps consider "cheesiness") can make for compelling designs.

Thursday, December 20, 2007

the NYT takes it up a notch

The New York Times recently posted this impressive interactive visualization showing the degree to which the presidential candidates are mentioning one another in their debates. The NYT graphics team has certainly been ahead of the curve in terms of producing readable (and beautiful) infovis, but this one strikes me as a step above their usual (great) stuff. A network diagram like this is a considerably more abstract visual encoding than a bar chart or line graph, so I'm surprised and pleased to see it being deployed in such a high-traffic context.

Matt Ericson, the head of their graphics department, has talked about the challenges they face in designing "infovis for the masses;" I wonder if broadening their visualization repertoire with examples like this represents an increased confidence in the "visualization literacy" of their readers. I also wonder if they work more experimentally with the graphics they produce for the NYT website versus the print edition, and if this reflects a perceived difference in the "visualization literacy" of those respective audiences. Assuming it could be reworked as a static image, would they feel confident deploying this visualization in the print edition?

Either way, this is a fantastic example of effective public-facing visualization.

Monday, December 17, 2007

the most wonderful time of the year

As the lack of recent updates suggests, it has been a very busy couple of weeks for me on account of the end of the semester finally arriving. On the plus side, I kicked out a draft of a thesis chapter and finished various other projects, so it was time well spent. I hope to start posting here more frequently again, but I'm also looking forward to the holiday break!

Anyways, I don't usually like to advertise my own visualization work here, but I just posted the final project for a course I was taking this semester, Media in Transition, that people might find interesting. It's a prototype visualization of a set of Incan artifacts called "khipu," as cataloged by the Khipu Database Project at Harvard. Without going into too much detail, the khipu take the form of hierarchically knotted strings used to encode information (so, arguably early information visualization!), though the meanings of these encodings remain largely undeciphered. I thought it would be interesting to prototype a visualization of the collection to suggest the value of infovis in facilitating exploration and analysis of the (somewhat unconventional) data set. You can check out the project page and applet here. More information about the khipu can be found at the Khipu Database Project page, and of course on Wikipedia. There's also an interesting article on them from a few months ago.

Happy holidays!

Wednesday, November 28, 2007

the usability of YouTube

I just came across this short paper on the usability of YouTube that I thought was interesting given the recent discussion here and on Stephen Few's blog. The authors point out that YouTube is a hugely successful site despite sporting an interface design that apparently doesn't respect many conventional usability heuristics. By considering what users enjoy about the site, and what keeps them coming back, they suggest that these traditional evaluation methods may need to be redefined; among other things, "engagement" and "playfulness," two terms we've been throwing around here lately, are becoming increasingly important.

While I won't argue that YouTube serves the same purpose as information visualization (particularly if its "apparently bad design" is intentional, as the paper suggests), what I find most poignant about this article is how it identifies the "new" Web 2.0-enabled user, a digital native who grew up with the web, as having a heightened literacy for the internet and its technologies. This new user interacts with information differently and more fluently than users of the past, so the design of interfaces for them should reflect this -- a sentiment that certainly applies to information visualization design as well.