I was just alerted to this appropriately themed visualization, posted on the New York Times website a couple of days ago. Once again, the NYT has taken a very progressive approach to its designs. This one is a variation of a stacked graph (popularized on the internet by Martin Wattenberg's Baby Name Voyager, though this implementation looks very similar to Lee Byron's visualizations of Last.fm listening habits) showing the box office earnings of many of the major Hollywood movies released in the last 20 years. There are lots of interesting trends to discover in here, as the stacked graph is pretty good at illustrating "outliers" in the data. Even better, it is a very striking design that encourages you to explore it.
The NYT is clearly pushing at the boundaries of traditional newspaper infographics, which I think is fantastic. As usual, I find myself wondering what kind of response these visualizations generate from NYT readers. Do they find them more compelling? More readable? Or the opposite? This type of stacked graph, for instance, is a perfect example of a visual construction that flirts with semantic ambiguity just enough to potentially become confusing. I have recently been arguing in my thesis that the ways in which we attempt to interpret more elaborate examples of infovis today are based on the basic graph-reading skills we were taught in grade school (which, unfortunately, is about the extent of the visualization education most of us receive). We implicitly look for what Jacques Bertin described as "the x y z construction": the relationship of visual marks to the principal axes of the image plane. In other words, "the independent variable goes on the X axis, and the dependent variable goes on the Y axis." Or, as Matt Ericson recently suggested, "time goes on the X axis, and anything else goes on the Y axis" (my paraphrase).
Some constructions, such as this type of stacked graph, cannot be understood quite so simply. While overall it is a fairly simple graphic, there are some significant ways in which its "marks" are ambiguously signified. For instance, in this implementation there is no literal X axis (i.e., the horizontal line with the "x" next to it). If you were to try to understand it as a basic 2D plot, you might wonder where the "zero" is on the Y axis. Accordingly, do the curves that point downwards represent negative values? And what is the significance of the Y-ordering of the data? Of course, the first two (theoretical) questions are inappropriate for this type of plot, and the answer to the third is that the significance is ambiguous or undefined. It is simply the nature of this type of "layout algorithm." But if you were trying to read this graphic in terms of the basic constructions we learned as kids, those questions might rightly stump you and turn you off of the whole thing. This is something I think we need to be aware of when designing infovis for a broad audience.
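To make the "no zero, no negative values" point concrete, here is a minimal sketch of one way such a layout can be computed. It assumes a symmetric (centered) baseline, which is only one of several possible choices, and I don't know which one the NYT graphic actually uses. The function name `stack_layers` and the earnings figures are made up for illustration.

```python
def stack_layers(layers, centered=True):
    """Return a (bottom, top) pair of edge lists for each layer.

    layers: list of equal-length lists of non-negative values.
    centered: if True, shift the whole stack so it is symmetric
    around y = 0 (an assumed baseline choice, not the NYT's known method).
    """
    n_points = len(layers[0])
    totals = [sum(layer[t] for layer in layers) for t in range(n_points)]
    # The baseline is the lower edge of the bottom layer.
    baseline = [-total / 2 if centered else 0.0 for total in totals]

    edges = []
    bottom = baseline
    for layer in layers:
        top = [b + v for b, v in zip(bottom, layer)]
        edges.append((bottom, top))
        bottom = top  # the next layer sits directly on this one
    return edges


# Hypothetical weekly earnings for three films (invented numbers).
films = [
    [1.0, 4.0, 2.0, 0.0],
    [0.0, 2.0, 6.0, 2.0],
    [2.0, 2.0, 2.0, 2.0],
]

edges = stack_layers(films)
for (bottom, top), values in zip(edges, films):
    thickness = [t - b for b, t in zip(bottom, top)]
    print("bottom:", bottom, "top:", top, "thickness:", thickness)

# Stacking the same films in the opposite order.
reordered = stack_layers(films[::-1])
```

The lower edge of the bottom layer dips below zero even though every value is non-negative: the data is encoded in each band's thickness, not in its Y position. Stacking the same layers in reverse order changes where each band sits but leaves the outer silhouette and every thickness unchanged, which is why the Y-ordering carries no defined meaning of its own.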
Obviously, that is by no means a criticism of this NYT piece. Exactly the opposite: it won't be until visualization education is advanced beyond what was probably taught 100 years ago (that'll be the day), or more modern examples of infovis proliferate in the "public consciousness," that infovis will be seen as a viable mainstream medium. The NYT is certainly pushing towards the latter goal! Big ups to that!
More discussion on Nathan's blog, FlowingData.
PS. Does anyone who reads the NYT website more regularly than I do know how to get these visualizations as an RSS feed? I have subscribed to their "multimedia" feed, but all I get are photographs. I'm missing quality stuff here!