649week07 Organizing Representation Characteristics

You may be internalizing some of the readings on representation last week. As you develop your own framework for thinking about information representations, it may be helpful to think of these examples of developing frameworks to organize representation characteristics. You have to develop your own, and I am not recommending these, just using them as discussion aids.

choosing-chart

The above example, found here, starts with four objectives: comparison, relationship, distribution, and composition. From there, the chartmaker divides the data, then settles on formalisms. Note that a given formalism may show up in more than one destination: the scatterplot shows up under both relationship and distribution. Note also that the data here is always presumed to be what Agresti (2002) would call interval variables. Finally, only a few formalisms are shown. For instance, relationship could be extended with scatterplot matrices to show relationships between more than three variables.
periodic-table

A more elaborate framework is shown above. The Periodic Table of Visualization shows about a hundred methods on a display that borrows its style from the Periodic Table of Elements. When your mouse hovers over any given square of the table, an example pops up. The above snapshot was taken while my mouse hovered over the semantic networks square. As your thinking about information representation crystallizes, I expect you to value a framework like this more and more. It probably took a lot more effort and study to create than the previous example, and uses definitions and examples from literature. It includes interface features allowing a more compact overview (context) with a dynamic focus. What else can you say about this framework?

microarray

Let me just briefly comment on the inspiration that people took from Eytan’s cellular automaton. Here’s a DNA microarray, a relatively new tool for visualizing gene expression under different conditions. This tool was developed for a very particular kind of scientific visualization, but can easily be repurposed. The original use is to observe the degree of gene expression under different conditions. In this example, there are four possible effects, each shown by a different color: red, green, yellow, or black. In addition, each of these four effects can be weak or strong. As with Eytan’s cellular automaton, it’s easy to imagine repurposing this in any system that records a large number of measurements when you know that each measurement can take on only a few values in a few categories.
samsung-menu

Finally, here’s an example of a hierarchical menu from a recent trade show. It should be clear that this is the same formalism for tree display as used in the Apple Finder Column View, and that it solves the problem of trying to depict a large tree in a (potentially small) resizable rectangle. Some elements in this picture are not essential to that mission, such as the curves at each level. There are also some possible additions you could consider. For example, if you have many entries at one level, you could add a way to represent how much of that level is hidden. With the Apple Finder Column View, that is accomplished by the scroll bar: the length of its thumb is proportional to the visible portion of the level, so a shorter thumb means more is hidden. What other ways can you think of to accomplish this?
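
To make the "how much is hidden" cue concrete, here is a minimal sketch in Python. It assumes the common scroll-bar convention in which the thumb covers the visible fraction of a list; the function names and the item counts are hypothetical and are not taken from the Finder or from the trade-show menu.

```python
def thumb_fraction(visible_items: int, total_items: int) -> float:
    """Fraction of the scroll track the thumb should cover (assumed convention)."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    # The thumb never grows past the full track, even if everything fits.
    return min(1.0, visible_items / total_items)


def hidden_fraction(visible_items: int, total_items: int) -> float:
    """Fraction of the level that is currently out of view."""
    return 1.0 - thumb_fraction(visible_items, total_items)


if __name__ == "__main__":
    # Hypothetical level with 40 entries, 10 of which fit in the rectangle.
    print(thumb_fraction(10, 40))
    print(hidden_fraction(10, 40))
```

Any other cue for the hidden portion (a count badge, a fading edge, a density strip) could be driven by the same hidden fraction.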

649week06 Salary Performance Round 2 - After Action Report

big-canvas

One of our papers this week discussed multiscale displays, addressing the problem arising when some features of the display need to be represented at different scales to perceive the detail in each feature. The above solution reminded me of this problem because of the attempt to balance detail about each team with the large number of teams and the need for a lot of space if each team’s path is represented by very many pixels. The five teams whose paths are shown here hint at this problem. They also suggest a preview of thinking about interaction. We might try to imagine the smallest number of interactions needed to provide an interesting picture of, say, historical rivals, geographic rivals, popular rivals, likely playoff pairings, or other configurations that may prove popular. When planning a shortcut, it’s always tempting to think of the steady state. What I mean is that, once the system is popular, it will be easy to determine what people want to see. What about that time before the system is popular (the only time most systems ever see!)? How do you make really good guesses before you have much data? How do you make the most of minimal data? How do you balance providing enough shortcuts for experienced users with providing a non-bewildering interface for first-timers?

cellular-automaton

Inspiration came to one team in the form of a cellular automaton discussed earlier in class. This raises, among other questions, the question of where the inspiration meets the structure of the information you want to represent. The cellular automaton we saw provided a convenient way to represent many specific points representing several nominal values. How does that fit into a configuration where we’d like to represent binary values (individual wins and losses) and values on two or three scales (comparative won/loss, payroll, and a hybrid of the two)?

Color might work for comparisons between teams, particularly if we can find hot and cold colors to represent magnitude in the performance or salary domains. There’s room for a lot more exploration of this first step.

recording-divergence

The tension between the need to diverge and the need to converge plays some role in every solution to a design problem. It looks like this group spent most of its time diverging, exploring the information and possible representations and interactions. Divergence strikes me as most useful when you can later review it and activate what you learned during a converging period. Does this display document what you did so that you can later re-engage? Can you explain this portion of the display to a casual observer? (By the way, I only snapped the right side of the board—on the left you can see the edge of their final design.)

animated-avatar

One group described an animated avatar of a batter, sending a baseball along a path corresponding to some variable of interest. This kind of thing seems to rely a lot on execution. Would the batter be skinnable? Would you try for photorealism? Would you be able to cross the uncanny valley if you did so? How tightly would the batting integrate with the graph of the path? If I wanted to review a team’s performance, would I need to see another instance of batting? Who is likely to want to see batting over and over again?

Apart from execution is the question of how attached fans are to particular players. I’ve known people to admit that they play fantasy football, but I don’t know if there is an analogous phenomenon in baseball.
geo-perspective

The representation above may be meant more to provoke discussion than to lead to implementation. After all, it directly challenges many of the guidelines we reviewed in our discussion of the normative perspective. Tufte even includes a similar map, decrying it with his familiar pejorative, chartjunk. One thing I like about this design is that it’s very carefully developed, so that if I want to challenge it, I can do so with more precision than I could with the less-developed sketches.

The main issue of interest is the choice of a map of states to represent 30 teams, many of which carry the names of cities rather than states and at least one of which is situated in Canada, outside the map area. A secondary issue is the choice of making the states “pop up” from the surface of the map in proportion to the records of the teams. The area thus extruded would be a function of both the team’s record and the area bounded by states. In the case of states with more than one team, a tectonic plate scheme is demonstrated for the case of New York. This state is interesting because both teams are situated in the southeastern corner and may have nothing whatever to do with fans in most of the geographic area depicted. This problem was recently discussed on the website fivethirtyeight when Nate Silver tried to make a map of the new Congress. He wanted to maintain fidelity with two constructs he valued highly: the shapes of the states and their borders with their neighbors. Since most New Yorkers live in the Congressional districts in the southeast corner of the state, it’s not easy to show representation (based on total population) and maintain shape and borders. Still, Silver’s map represents a thoughtful attempt. (See more here.)

(picture links to its creator's blog post about it)

Another challenge this representation faces is how to depict the payroll. The choice shown differs a little from the accompanying comments. Piggy banks are mentioned but what we can see is a series of dollar signs associated with each team. Note that the small boxes framing the dollar signs are of uniform size. The dollar signs themselves are so small that I found myself having to count them to compare the Mets to the Texas team. So in placement, size, and framing, I am discouraged from making comparisons.

Finally, let me comment on the “unreadability zone” chart at the bottom of the display. This chart purports to measure abstraction and complexity and to show a smooth curve between them. I don’t know what a chart of abstraction vs. complexity would look like, but I have no reason to believe it would be a smooth curve. I have absolutely no reason to believe that increasing abstraction and increasing complexity decrease readability. To convince me of such claims would require some kind of operational definitions and some kind of plausible test procedure to verify or falsify the claimed relationships. Casual perusal of science, and especially of heatmaps of DNA sequencing, convinces me that quite readable displays (even for the lay person) are practical for some of the most abstract and complex data we can conceptualize. Nevertheless, I’m glad that this chart was attempted because you can reflect in a more articulate way by giving concrete expression to your ideas!

what-circles-mean

The above design really impressed me, but rather than praising it in class, I tried to highlight what might be problematic about it and suggest how you might investigate that. I did this by asking people questions about the display and I believe that the answers to these questions highlighted the deep-rooted tendency to believe that a circle represents turning or sequential representation. Some people did not immediately get that the pie slices were teams. Also, the spokes of differing lengths may have suggested turning, imbalance, or an intermediate state between resting states. You can work with or against these initial impressions, but you can’t will them away.
using-every-inch

A couple of groups used the entire whiteboard. The above representation in particular used every inch of available space. This was also a very fully realized design, at the convergence end of the spectrum of time allocated to converging and diverging. The discussion of this design included some confusion about centrifugal force. I remember that someone said that centrifugal force would tend to push things to the outside of a turning wheel and that that would suggest that the winners would gravitate toward the inside as the hardest place to reach. This reminds me of the carnival ride where children sit on a spinning disk. As it spins faster, it becomes harder to hold on, but especially harder if you slip at all and go near the edge. The winner of this ride is the one who can stay closest to the center for the longest time.

This representation provides more area to show the losing teams if the spokes are marked off in equal lengths for each win or loss, but everything about the representation (the spokes, the wheel, the spiral of paths) drives the eye toward the center, toward the team with the best record. So it seems like a fair trade-off to provide more space to the less successful teams but to place the more successful in the exact location to which the eye naturally moves. In any event, you can certainly divide the space up like a hyperbolic plane or with some other configuration to give more space to the items closest to the center.
under-the-sea

The above picture elaborates on a similar representation from last week. How does it differ? The metaphor of sea level, reinforced with bubbles, fish, seaweed, and a giant clam (nice touch!) uses very little ink to make the point, a sign of strong draftsmanship. The fulcrum and lever portrayal of money and ranking is well-executed and drew a lot of attention during our discussion. One question that came up was whether the depiction reinforces or disputes our prejudices about the difference between money wasted and money well-spent. As shown, the wasted money has little or no weight, as if it’s ineffectual. Some people imagine wasted money as dragging down the team that wastes it, as a burden instead of a resource. It might be worthwhile to show this picture to a series of users and ask them what it means.

whole-world-plus-detail

The final picture, above, adds something I didn’t notice in any other picture: the state of the world. The vertical line beneath the date July 8 becomes narrower as frugal teams win more and wider as spendthrift teams win more. This is an extra piece of information no one else (as far as I know) thought to represent. It’s interesting and easy to perceive. Well done!

There are some other things to praise about this depiction. One is that the definition of effectiveness is unambiguous because of the exceptionally clear presentation of the numerical expressions at the top of the display. There is also the reminder in the upper right that it would be easy to provide controls for this design, sorting the bands in different orders to allow more direct comparisons between different teams or sets of teams. The ability to portray differences is also spotlighted in the simple, effective use of crosshatching for some of the bands. Finally, it’s an economical use of ink, with just three effective arrows on the entire display.

649week06 Information Representation in Your Projects

I’d like to continue with our discussion last week about the application of ideas in readings to your projects. In that vein, I’d like to review the user / information material you’ve already shared and think about how that informs your next steps. You have a lo-fi prototype due next week. This, as those of you who know me are aware, can be done in ball-point pen on torn notebook paper. The fidelity of the prototype to your vision is not an issue. What is at issue is exactly what we’re discussing today, information representation. Later, you’ll have the opportunity to fully analyze the role of information representation in your project, but you have to start somewhere. You can present a straw man next week as a focal point for your further work.

ARMuseum

Museums are an endangered species. When we look at the museums that have been really successful in engaging visitors, they work on multiple fronts to popularize their offerings. One very successful museum has communicated metrics to me including “time spent engaging with an exhibit” as the most prominent metric. One question that pops into my mind is whether there’s anything you can do that will make people revisit an exhibit or engage in storytelling about an exhibit that will lead others to that exhibit. I personally don’t believe in the “saving favorites” bunny. My reason is that “saving favorites” in my experience has always meant a placeholder and placeholders always look alike. The tab visualization project is an example of trying to create a memorable object about what has hitherto been a placeholder. Therefore, instead of optimizing the placeholderness, I would work toward creating a memorable link to an experience.

I would seek to minimize interaction (in contrast to other projects). You’ll have more opportunities to think about interaction later on anyway and you may find that a little accelerometer control will be sufficient. One concern that museum people have about portable devices is whether they will consume too much attention. You can gain the trust of these people by deemphasizing interaction.

One thing to highlight about your “functionality brainstorming” is your idea of a user-driven recommendation system. Just because I am the only person standing in front of an exhibit at a given moment, doesn’t mean I can’t be part of a crowd around that exhibit over time. How can you portray the entire crowd around an exhibit? I urge you to represent information that would be easy to collect. If your recommendations require a high intensity of interaction, entering them may drag attention away from the exhibit. Suppose, just as an example, you were to portray “hot spots” where the most people stood longest. Or suppose you were to collect dwelling-time information and load a slideshow of the visit, based on that dwelling time information at exit.
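
As a rough illustration of how little data the "hot spots" and slideshow ideas need, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the log format, the visitor and exhibit names, and the idea that a device can record seconds spent near an exhibit.

```python
from collections import defaultdict

# Hypothetical dwell log: (visitor_id, exhibit_id, seconds_spent).
dwell_log = [
    ("v1", "mummy", 120),
    ("v2", "mummy", 300),
    ("v1", "dinosaur", 45),
    ("v3", "dinosaur", 60),
    ("v3", "mummy", 30),
]


def hot_spots(log):
    """Total dwell time per exhibit across all visitors, longest first."""
    totals = defaultdict(int)
    for _visitor, exhibit, seconds in log:
        totals[exhibit] += seconds
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


def slideshow_order(log, visitor):
    """Exhibits one visitor lingered at, longest total dwell time first."""
    totals = defaultdict(int)
    for who, exhibit, seconds in log:
        if who == visitor:
            totals[exhibit] += seconds
    return [exhibit for exhibit, _ in
            sorted(totals.items(), key=lambda pair: pair[1], reverse=True)]


if __name__ == "__main__":
    print(hot_spots(dwell_log))
    print(slideshow_order(dwell_log, "v1"))
```

Note that nothing here asks the visitor to enter anything, which is the point: the crowd around an exhibit accumulates over time without drawing attention away from it.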

Tacoma Crime

Here the information representation issues seem to me to be able to help with some of the underlying information problems. The two that seem to loom largest are jurisdiction and geography. Different jurisdictions have different vocabularies and standards for measurement. The FBI tries to standardize practices nationwide, but these efforts are directed toward local police departments rather than citizens. Cash-strapped police departments find their resources stretched to accomplish their mission, of which informing the general public is only part. You may find that you have a competitive advantage in summarizing information across jurisdictions in that you don’t have to privilege any particular agency’s perspective.

The geographic issue is mainly that of points and regions. Crimes usually happen at very specific locations, but are reported as part of regions. This creates a representation problem. An important issue for police is gang activity and gangs are generally perceived as regional. Both crime and gang activity can be represented by choropleth maps or heat maps. Both of these representations make statistical assumptions. You have to think about your users and their use cases to inform these assumptions.
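
One of those assumptions is the choice of region itself. The following minimal sketch in Python bins the same point incidents into square grid cells of two different sizes. The coordinates are made up, and a square grid is only a stand-in for whatever regions (precincts, census tracts, neighborhoods) a real map would use.

```python
from collections import Counter

# Hypothetical incident locations as (x, y) positions in kilometers.
incidents = [(0.2, 0.3), (0.8, 0.4), (1.1, 0.2), (1.9, 1.7), (0.4, 0.9)]


def counts_by_cell(points, cell_size):
    """Count incidents per square grid cell of the given side length."""
    return Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )


if __name__ == "__main__":
    # Fine grid: the incidents look concentrated in one cell.
    print(dict(counts_by_cell(incidents, 1.0)))
    # Coarse grid: the same incidents collapse into a single region.
    print(dict(counts_by_cell(incidents, 2.0)))
```

The same five points tell two different stories depending on the cell size, which is why the users and their use cases should drive the choice of regions.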

Change We Can Visualize

Information overload is central to this project. Nate Silver’s website demonstrates how much publicly available information can be brought to bear on questions about Congress in general.

A serious problem for this project is the vast amount of unstructured relevant information. As an example, consider the news stories about the automaker bailout. Opposition crystallized in right-to-work states with Japanese auto factories. The relevant information to understand the dynamics of the arguments may be tabulated somewhere, but it’s mainly available in narrative form. Further, an important dynamic in this debate was the reliance of the right-to-work state factories on suppliers who would be undone by the collapse of the Big Three. This reliance led manufacturers in those states to signal that opposition to the bailout might adversely affect the constituents of the leaders of the opposition. How can you represent a story like this?

You may want to start by selecting a story and representing all the information in it in a single design. You might be able to develop and refine a model for giving an illiterate or semi-literate population access to information that otherwise would be available only to those with very high reading skills.

You may want to, at least at first, work on this from two fronts. One would be the representation of stories in a graphical form and the other would be the creation of hooks that let a constituent focus on preferred issues. The guidance given by some activists after the 2004 presidential election was that those dissatisfied by the outcome should select their top issue and join a group tracking that issue. Can you imagine how you could parse information so that any user could once specify a top issue and see its position on repeated visits?

TabViz

As I mentioned above, there is an opportunity here for linking to a memorable experience. You have started with some really great ideas in a different direction and I don’t want to take anything away from that great work. But it would be great to evoke the prior browsing experience in tabs. The features in iPhoto for browsing folders and in iMovie for browsing clips both come to mind, as does the thumbnail slideshow displayed at the Moving Image Archive (http://www.archive.org/details/movies).

People who study consumer behavior have formulas to make predictions. These formulas are often based on a weighted linear combination of three numbers, representing recency, frequency, and amount of purchases. You could easily cast visits to sites in this same way, collect this information, and present tabs on this basis. Early on, you found that Gmail typically occupied the left-most tab. This may have been an expression of a more general phenomenon, one that a recency / frequency / amount formula could capture and that could be applied to tabs in general.
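
Here is a minimal sketch in Python of what such a weighted linear combination might look like for tabs. The fields, the weights, the sample numbers, and the choice of minutes spent as the stand-in for "amount" are all assumptions made for illustration, not measurements from your study.

```python
from dataclasses import dataclass


@dataclass
class SiteVisits:
    name: str
    days_since_last_visit: float  # recency (smaller is more recent)
    visits_per_week: float        # frequency
    minutes_per_week: float       # hypothetical stand-in for "amount"


def rfm_score(site, w_recency=10.0, w_frequency=1.0, w_amount=0.1):
    """Weighted linear combination of recency, frequency, and amount.

    The weights are arbitrary placeholders; they also absorb the
    different scales of the three inputs.
    """
    recency = 1.0 / (1.0 + site.days_since_last_visit)  # 1.0 means "today"
    return (w_recency * recency
            + w_frequency * site.visits_per_week
            + w_amount * site.minutes_per_week)


def tab_order(sites):
    """Names of sites, highest score first (left-most tab first)."""
    return [s.name for s in sorted(sites, key=rfm_score, reverse=True)]


if __name__ == "__main__":
    sites = [
        SiteVisits("wiki", 10, 1, 5),
        SiteVisits("gmail", 0, 50, 200),
        SiteVisits("news", 1, 7, 60),
    ]
    print(tab_order(sites))
```

With these made-up numbers the mail site lands in the left-most position, which is at least consistent with what you observed; real weights would have to be fitted to real browsing data.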

Buckets of Rain

Chuck Workman created several short films, such as 100 Years at the Movies (1994) with the theme of presenting a great deal of information about Hollywood in a very short period of time. In the above example, Workman displayed images from 225 motion pictures in about 9 minutes. It might be worthwhile to watch this and record your reactions (or the reactions of potential users).

One challenge we face in information visualization is the role of audio. If the central problem is to portray a person’s movie-goer identity, audio may figure prominently. Is that a visualization issue? Walter Murch says that sound is half of the movie-watching experience in an interview here.

Visualizing the CCEL

The Christian Classics Ethereal Library has some hallmarks of a typical network problem. One representation that immediately pops to mind is the Similar Diversity project, pictured at

(picture links to the similar diversity project)

or the visualization of words in conversation by Natalia Rojas at http://www.nataliarojas.com/p55/msn_history/. The difference between these two is that the second provides two completely separate representations, a cylinder and a strip, while the first tries to locate texts within one representation, so as not to privilege one over another. The user study in this case suggests a very clear consensus about the pecking order of the texts and the importance of age as a characteristic in that order.

OCWViz

Like the previous example, this project, visualizing learning paths through open course ware materials, may require two separate artifacts, one about the material available and one about the paths being followed through that material. One question I’d like to pose is whether there is a skeletal outline for learning such that the following scenario could be played out. A student searches for Pareto and browses through a number of hits on Pareto distributions and Pareto optimality, finally settling on a complete read through of “Power Laws, Pareto distributions, and Zipf’s Law”, an article by MEJ Newman assigned in Week 4 of SI708. After reading this article, the student wants to dig deeper. As you may imagine, that article could fit into a number of learning paths. There are places on the paths before the article where students would learn the background they need to understand it, as well as places where the applications appear in physics, statistics, economics, and other disciplines. In addition, there are places on the paths following this article, further studies building on it. How can you represent the differences between these paths and help the student find a desired next step?
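
One possible skeletal outline is a directed graph in which an edge means "this item builds on that one." The minimal sketch below, in Python, is only an illustration: the topic names and the edges are invented for the example and do not describe any actual course materials.

```python
# Hypothetical outline: each key lists the items that build directly on it.
builds_on = {
    "probability basics": ["power laws article"],
    "logarithms": ["power laws article"],
    "power laws article": ["network degree distributions",
                           "income inequality models"],
    "network degree distributions": [],
    "income inequality models": [],
}


def next_steps(graph, item):
    """Items that follow directly after the given item on some path."""
    return list(graph.get(item, []))


def background(graph, item):
    """Items that come directly before the given item on some path."""
    return [source for source, targets in graph.items() if item in targets]


if __name__ == "__main__":
    print(background(builds_on, "power laws article"))
    print(next_steps(builds_on, "power laws article"))
```

The representation question then becomes how to show a student both lists at once, and how to distinguish paths that lead into different disciplines.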

Talk is CEAP

As with the “Change We Can Visualize” project, this is a case where we may find unlikely allies in hard-to-reach places. The example I’m thinking of is the support for the Big Three bailout from rival automakers in right-to-work states. They saw a non-obvious threat to their suppliers and helped to attenuate complaints from their elected representatives. As in the “Change We Can Visualize” example, much of the relevant information may exist in narrative form. Certainly data sources are balkanized.

The project has already identified the following as critical information: workshop details, statistical snapshots, plant closure information, contact numbers and email IDs of important people, and what’s going on in the community now. It has also identified a mixed audience, some of whom won’t get access to all the information. The group has mentioned the map metaphor as a way to organize this information and what I’d like to see is a number of examples of maps that might help. Check out How to Lie with Maps by Mark Monmonier (our library appears to have about 5 copies). Check out a “Techbeat” post about Apple’s Design Process in Business Week, posted by Helen Walters. In particular, consider the 10 to 3 to 1 guidance, where Apple comes up with 10 design mockups of any new feature before restricting themselves to three promising ones. Another guideline is to develop pixel perfect mockups to reduce ambiguity. A third is to pair a blue sky design meeting with a pragmatic design meeting, so that ideas are not stifled in the blue sky meeting and impractical ideas can be sidelined in the pragmatic meeting.

Web Visibility

There are two audience members, as far as I can tell, the Pure Visibility analyst and the client. These two are having a conversation about the visibility of the client’s website. They would like to have an artifact to refer to in that conversation. The conversation may be a kind of negotiation between these two interlocutors. The analyst’s contribution is more technical, while the client’s contribution is more domain-focused. The negotiation may benefit from a representation like a SWOT diagram (a diagram with a good-bad axis and an internal-external axis, giving four quadrants named strength, weakness, opportunity, and threat), where items can be added and subtracted from each quadrant or even redefined and moved from one to another. Information objects may include competitors and words or clusters of words, as well as sites and clusters of sites.
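
To show how simple the underlying structure of such an artifact can be, here is a minimal sketch in Python of a SWOT board whose items can be placed and later moved. The function names and the sample items are hypothetical; only the two axes and the four quadrant names come from the description above.

```python
QUADRANTS = ("strength", "weakness", "opportunity", "threat")


def quadrant(helpful: bool, internal: bool) -> str:
    """Map the good-bad axis and the internal-external axis to a quadrant."""
    if internal:
        return "strength" if helpful else "weakness"
    return "opportunity" if helpful else "threat"


def new_board():
    """An empty board with one set of items per quadrant."""
    return {name: set() for name in QUADRANTS}


def place(board, item, helpful, internal):
    """Add an item, first removing it from wherever it sat before."""
    for items in board.values():
        items.discard(item)
    board[quadrant(helpful, internal)].add(item)


if __name__ == "__main__":
    board = new_board()
    place(board, "well-ranked product pages", helpful=True, internal=True)
    place(board, "competitor ranks for our key terms", helpful=False, internal=False)
    # After discussion, analyst and client redefine the second item.
    place(board, "competitor ranks for our key terms", helpful=True, internal=False)
    print(board)
```

Moving an item is just placing it again, which mirrors the negotiation: the item stays the same while its position on the two axes is what gets argued over.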

Later, you will find yourself with information objects that the analyst and client may want to use in interaction, but for now, you may want to concentrate on identifying and positioning information objects that have some relevance to the decisions being made about the client websites.
Green Box

This project holds a special fascination for me as I have recently started to use widgets. I just acquired a computer (Raon Digital Everun Note) with a reputation for running at very high temperatures. Some users have posted concerns about the high temperatures to forums. They fear that the incessant high temperatures may be damaging to the device. Some are also questioning the environmental impact. As a result, I searched for widgets to tell me about the machine’s state, especially the temperature, but also other characteristics. I’ve always used processor monitor widgets on my Macs without consciously identifying them as such. It’s hard for me to imagine a computing environment without processor monitor widgets because I have been using them for so long. Now that I am looking for temperature monitors, I’m starting to appreciate widgets as a useful class of desktop objects and to see a lot of design opportunities and visualization opportunities for them.

Today I met a faculty candidate interested in using widgets to exchange collaborative information on the desktops of workgroup partners. He pointed out Plasmoids on KDE as an appropriate widget platform for putting things on the desktop.

Recently, I read about an iPhone app called Ocarina. A rave review is here. What fascinates me about this app is the way they’ve managed to get people to give up some privacy for some value. You might be able to get people to let your widget record, for instance, the websites they visit and give them an environmental score, assuming that some or much of their browsing is shopping related. How you secure that cooperation may be a measure of your talent.

649week05 Salary Performance Exercise - After Action Report

The salary performance exercise revealed numerous perspectives on information and information representation. I’d actually like you to continue your thinking along these lines next week, so it may be helpful to review some of the issues you raised. To do so, let’s look at the whiteboards, and in one case, the online document you used to explore the problem. I went back and took photos of the whiteboards to review them. My photos did not turn out very well, so I photoshopped them to highlight your marker strokes and soften the smudges on our whiteboards. I hope you don’t mind that this distorts the original work, and accept it in the spirit of spotlighting topics for discussion.

inverse-exploring-representations

The first thing most of you did was wrestle with the nature of the information. The concepts shown above include the notion of seasonality and cyclicality, size of icons as a proxy for quantities of money and symbology familiar to your audience for both time and money.

icon-piggy-banks

It seems that the basic entity we’d like to represent is the team. And the main attributes of that entity are its salary and performance. Salary, in this data, is a constant. Performance varies, so the ratio of salary to performance varies. The extremes of that ratio are represented by the burning money and the logo piggy banks, whimsical icons that may fly through space (as a proxy for time).

icon-path

A simple, compelling way to represent a team’s wealth is by a circle sized in proportion to that wealth. This sketch suggests that animation and memory of a path may be enough. Do we need a trail? Can we just show these bubbles progressing from left to right with a slider? What would we make of trails, if any? I like this representation but bear in mind that, at any moment in the animation, only a narrow vertical slice is being used to actually show bubbles. The vast majority of the canvas is used to portray time. We’ll see a solution to this problem later.
discussion-sketch

This exercise was meant to prompt discussion of possible solutions. You can see that this sketch was the focal point of an active discussion in which a lot of ink was spilled. I consider that a sign of successful brainstorming. Here we see further development of the idea presented in the previous sketch: circles marching across a time field from left to right. This is a very frequent depiction of time. Again, as with the previous example, this devotes a lot of space to the time representation and leaves the possibility of trend lines as a record. A significant concern with this approach is that the two lines shown may be much more legible than lines for all teams appearing at once would be.
bar-chart

Here is one of three space-saving approaches that do not use space to depict time. In this representation, little red balls bounce up and down showing shifts in performance as the background of bars showing salary remains constant. I can imagine refining this display by allowing the user to rearrange the order in which teams appear according to different criteria. It would also be interesting to think about letting the balls leave brief trails as they move or otherwise change. Finally, the bars representing money are muted here. Should they be more muted or more prominent? It seems to me that there are two conflicting desiderata. First, the money is important. Second, the money is constant. Those two characteristics suggest that maybe we want to find some function of money, such as bang / buck, that is not so conflicted and represent that instead.
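
A bang / buck function can be as simple as wins divided by payroll. The minimal sketch below, in Python, uses invented team names, payrolls, and win totals; the particular ratio (cumulative wins per million dollars) is one assumption among many possible definitions.

```python
def bang_per_buck(cumulative_wins, payroll_millions):
    """Wins per million dollars of payroll (one possible definition)."""
    if payroll_millions <= 0:
        raise ValueError("payroll_millions must be positive")
    return cumulative_wins / payroll_millions


def weekly_series(wins_by_week, payroll_millions):
    """Bang / buck at the end of each week; payroll stays constant."""
    return [bang_per_buck(wins, payroll_millions) for wins in wins_by_week]


if __name__ == "__main__":
    # Hypothetical teams: same win totals, very different payrolls.
    wins_by_week = [2, 5, 9]
    print(weekly_series(wins_by_week, payroll_millions=50))
    print(weekly_series(wins_by_week, payroll_millions=200))
```

Because the payroll is folded into the ratio, the display no longer has to decide how loudly to draw a constant; only the derived value moves.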

fully-realized

This is a medium shot of a very detailed, fully-realized poster. The students who made this decided on their representation very quickly and spent most of their time creating a very detailed depiction of what they had in mind. This reminds me of a recent Business Week column about Apple’s design process. The first ingredient described by Michael Lopp, senior engineering manager at Apple, was Pixel Perfect Mockups. He claimed that they remove all ambiguity. As a result, they uncover problems very far upstream and have little need to correct mistakes later in the process. There’s quite a lot you can say about this design at this stage. The main point, from my point of view, is that it manages positive and negative space very well. Space is a precious resource for any designer.
infoviz_baseball copy

Finally, Mike Harmala shared with us the further refinement he made of a space-saving design after class. This one shows time explicitly with a very prominent arrow on the slider. It also appears to include the ratio between money spent and performance. Position is used to show performance, so a given circle may move back and forth across the surface from one week to the next. Size is used to show raw money spent, so we have one way each to show money and performance independently, as well as one way to show them together.

649week05 Principles and the Normative Perspective

0 Comments

What are principles? Tufte tells us that we should do certain things. In Tufte (2006), he tells us that we should show comparisons, contrasts, differences. It’s his first principle of analytic design. In what sense is this a principle? How do we know that this is what we should do? I shared this “principle” with one of the most successful scholars I know, Judy Olson, and she replied that we should show what’s surprising. We should show differences where those differences are meaningful, but we should show similarities where those are meaningful. Some trends are surprising; some similarities are surprising.

It is not at all clear that these principles are grounded in any kind of theory. At the beginning of my other class, I offer a lengthy quote from Stephen Hawking about theory in A Brief History of Time on page 7.

A theory is a good theory if it satisfies two requirements. It must accurately describe a large class of observations on the basis of a model that contains only a few arbitrary elements, and it must make definite predictions about the results of future observations. For example, Aristotle believed Empedocles’s theory that everything was made out of four elements, earth, air, fire, and water. This was simple enough, but did not make any definite predictions. On the other hand, Newton’s theory of gravity was based on an even simpler model, in which bodies attracted each other with a force that was proportional to a quantity called their mass and inversely proportional to the square of the distance between them. Yet it predicts the motions of the sun, the moon, and the planets to a high degree of accuracy.

Note that Hawking is giving what we might call a normative definition, telling us what a theory should be. There might be plenty of things parading around under the theory banner that could be classed as bad theory. Note also that Hawking requires that a theory both describe and predict.

Why does theory have value? The theory of gravity can be used to decide what to do in a wide variety of situations. A theory of analytic design could be used to tell us about the causes and effects of different actions we take as designers. But what about the Principles of Analytic Design as articulated in Tufte (2006)? What are they good for? For example, the sixth principle tells us that content counts most of all. I sincerely hope that you don’t believe that and that you instead believe that content alone doesn’t count for much. There would be lines around the block at the Library of Congress otherwise. Clearly, I am not suggesting that you take Tufte’s principles at face value. So, why then, do I recommend reading his work?

Part of the answer can be found in Tufte (2003), where he discusses the Cognitive Style of Powerpoint. Tufte rails against a badly designed presentation tool that makes a lot of choices for you and makes most of the choices poorly. While railing, Tufte does what Tufte does best (and what you can also see on his blog): he demonstrates his ability to carry on a data-laced conversation. I believe that, to do this effectively, it helps to be exposed to exemplary work. For this reason, and this reason alone, I think it’s worthwhile to familiarize yourself with the normative perspective. Another way to familiarize yourself with this kind of work is to look at the Mathematica notebooks of our colleague Eytan Bakshy. In fact, let me use this opportunity to call upon Eytan to share a suitable example with us as another aid to our conversation about what you should and shouldn’t do.

649week05 The Normative Perspective

8 Comments

Normative, or prescriptive, ways of looking at infoviz guide us toward a narrow set of fashionable design choices. You may guess from the words “narrow” and “fashionable” in the preceding sentence that I have mixed feelings about this perspective on information visualization. This is certainly a valuable perspective but, before we explore it, I’d like to insert two placeholders into your thinking. The first of these has to do with the position of infoviz in relation to other disciplines. The second has to do with users, contexts, and communities.

venn-visual

Above is a Venn diagram based on Table 1.1 of Card (1999), colored by ColorBrewer. Looked at this way, InfoViz may be said to inherit from several disciplines. There’s no reason why InfoViz shouldn’t borrow the principles of these disciplines as far as they offer design guidance. Plenty of guidance is available in the worlds of data graphics, information design, and external cognition. Consider this Venn diagram, an example of external cognition that has been available for about 125 years, with many refinements along the way. The coloring comes from a contemporary cartographer, Cynthia Brewer, and a data graphics aid called ColorBrewer. ColorBrewer asks you to consider whether the variables you’d like to color are sequential (which is what I chose), diverging (values that spread in two directions from a meaningful midpoint), or qualitative (nominal or categorical). It shows examples of color schemes reflecting these choices on a map of the Southeastern USA, as well as a variety of encodings such as CMYK and RGB to implement these color schemes. The colors were selected by an experimental process where the researchers displayed maps using various color schemes to participants who answered questions about the maps.

The ColorBrewer work is similar to that of William S. Cleveland, documented in his 1985 book, The Elements of Graphing Data, Hobart Press, Summit, NJ. Cleveland displayed graphics and asked questions about the quantities shown to experimental participants. In addition to his own work, he also collected studies conducted by his colleagues at Bell Labs. His books are responsible, for instance, for a worldwide aversion to pie charts. Why? Because he found that it was more difficult for people to estimate the quantities represented on pie charts than on other graphical artifacts.

This raises the second of the two issues I’d like to consider, even though we haven’t finished with the first. For now, let’s just make a note of the fact that we have privileged one issue: accurate estimation of quantities represented on a graphic. It seems fair to do so. If all else is equal, we certainly prefer the technique leading to accurate estimation. Let’s return to this subject after we finish with the first issue.

The first issue with the normative perspective is that plenty of guidance is already available for exemplary practice. Several disciplines offer useful principles. They’ve been doing so for a long time. There are a few writings on graphical principles throughout history and a steady stream in the last hundred years. So what do the prescribers offer?

If we think about design as constrained choice, it’s easy to see many graphical aids as making choices for us. I use the term “graphical aid” loosely when I mention Powerpoint, but this is surely the ubiquitous contemporary “graphical aid”. It, more than any other system, insists that we violate graphical practice as old as Aristotle. The prescribers of InfoViz practice, above all else, rail against chartjunk, such as that created by default Powerpoint settings.

This is not new. Christopher Alexander, in Notes on the Synthesis of Form, 1964, sets forth a compelling example of Slovakian shawls. Their golden age as prized souvenirs in the nineteenth century ended with the introduction of a new technology, aniline dyes. Slovakian shawls, formerly prized as delicate and subtle, then became burdened by a reputation as vulgar and uninteresting. Alexander’s insight into the interruption of this art form was to see that the key gift of the shawlmakers was to be able to recognize a bad shawl in a group of good shawls. Individual bad shawls would appear in the tradition, but they would not be repeated and gradually a high standard developed and could be maintained. The tidal wave of available colors from aniline dyes simply overwhelmed this facility. We can imagine a normative approach to rescue the shawl-makers from new technology just as Tufte has rescued us from Powerpoint.

saturation

The second issue has to do with users, contexts, and communities. A recent normative book shows an example of good graphical practice as (a) using high saturation for objects and low saturation for backgrounds. The picture in (b) shows just how bad things are when we reverse these. The problem with this good / bad dichotomy can be seen in a painting by Miro (c) and in a stock photo of peppers. In both cases, the goal is to make the eye restless. In the normative view, tranquility is prized, and we can’t simply say that one approach leads to tranquility without identifying that as good. The problem of users extends not only to different contexts, but to different users and to the social construction of the meaning of information artifacts.

649week05 Variables and Axes

2 Comments

Jasper Liu shared a learning experience with me that I’d like to pass on. You may recall that Jasper described some information for his Pure Visibility project and said he planned to depict it in three dimensions. He showed several pictures of labeled 3D axes, listing the information he planned to show as the axis labels. I objected to this on the basis that some of the information was nominal and some was ordinal (see Agresti, 2002), so their locations in three dimensions would introduce some ambiguity. Here’s a sample of the information, scores for terms in three search engines.

jasper1

Jasper’s response was to first plot some data to demonstrate what he had in mind. This is a crucial step. It’s often the case that our mental picture of a planned representation glosses over some practical limitations. When we draw a concrete picture and supply it with data we often see game-changing details. Such was the case for Jasper’s data. Here’s how it looked, plotted as 3D data.

jasper2

Jasper was dissatisfied with this picture and, significantly, looked at Card (1999), page 60, where the editors say “Static presentations often use retinal properties such as color to add an additional variable to a visual structure.” Reading this, he was apparently struck by the insight that he could use color to distinguish between search engines and greatly simplify the representation to the following picture. As we discuss the normative perspective, it might be helpful to look back at Jasper’s process.

jasper3
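Jasper’s move from three axes to two plus color can be sketched in a few lines. This is my own illustration under stated assumptions, not Jasper’s code: the records, the engine names, the hex colors, and the `to_2d_marks` function are all hypothetical placeholders. The nominal variable (search engine) moves off a spatial axis and onto color, leaving term on one axis and score on the other.

```python
# Hypothetical (term, engine, score) records, illustrative only.
records = [
    ("term one", "engine_a", 0.9),
    ("term one", "engine_b", 0.4),
    ("term two", "engine_a", 0.2),
    ("term two", "engine_c", 0.7),
]

# One color per engine: a retinal property standing in for the third axis.
# The hex values are arbitrary placeholders.
ENGINE_COLORS = {
    "engine_a": "#1b9e77",
    "engine_b": "#d95f02",
    "engine_c": "#7570b3",
}


def to_2d_marks(records, colors):
    """Map (term, engine, score) to 2D marks: x = term slot, y = score."""
    terms = sorted({term for term, _, _ in records})
    x_of = {term: i for i, term in enumerate(terms)}
    return [
        {"x": x_of[term], "y": score, "color": colors[engine], "label": term}
        for term, engine, score in records
    ]


marks = to_2d_marks(records, ENGINE_COLORS)
for mark in marks:
    print(mark)
```

The x positions here are only slots for the terms. Their order carries no meaning, which is the same ambiguity I objected to in the 3D version, now confined to one axis instead of two.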

649week04 Changes in Prominent Tools

0 Comments

Zipdecode by Ben Fry (picture links to app's site)

I claimed previously that InfoViz history spans two eras, a Big era and a Populist era. If that is so, the prominent tools should reflect this difference. In a way, they do. Prominent, talked-about tools have changed. Spotfire and Inxight are underrepresented today, for reasons that have to do with the ownership of the data they’re used to visualize, as well as with the social relations of the people involved in their communities. These Big systems are not designed for growth in the userbase.

649week04 Pictures of Prominent Tools

5 Comments

I mentioned Spotfire and Inxight, InfoViz tools that have enjoyed commercial success over many years and that were the product of an era when InfoViz was open only to those with access to Silicon Graphics workstations. It was exclusively the province of those who could forge partnerships between software engineers and cognitive psychologists.

http://www.inxightfedsys.com/pdfs/VizServer_Finalweb.pdf shows the three main information representations associated with Inxight: the hyperbolic browser (shown as StarTree in the brochure), Table Lens, and the perspective wall (shown as TimeWall in the brochure).

649week04 Prominent Tools

12 Comments

I’ve shared my view that the history of InfoViz can be divided into two eras, the Big Viz era and the Populist Viz era.  This distinction helps me to think about which changes over time are most important and where social and technical aspects of change meet.  So I want to mention papers about the two most prominent tools from the Big Viz era still in commercial use today, Spotfire and Inxight, and I’d like to contrast them with some other tools more closely associated with the Populist Viz era, although some of these tools existed long before they were associated with InfoViz.

Two InfoViz Eras (picture links to 50MB presentation)
