Info & Interaction Design: Information

Mick McQuaid




Information Overload

  • An old concept
  • Passing the basketball illustrates inattentional blindness
  • London map experiment illustrates the same principle
  • Terms like cognitive overload, change blindness are related

Information itself

  • Information theory (Shannon (1948), Cover and Thomas (2006))
  • Information architecture (Rosenfeld, Morville, and Arango (2015))
  • Information Design (Redish (2014))

Unstructured information

  • Term “unstructured” may be misleading
  • There is structure but it follows no strict rules
  • Examples include news articles and blog posts
  • “Unstructured” is a matter of degree and is best understood by comparing it to semistructured and structured information

Semistructured information

  • Labeled information, e.g., forms filled out by people
  • There are rules, but it’s often easy to break them
  • Hence, human intervention is required to deal with rule breakage

Structured information

  • Obeys strict rules
  • Can be processed rapidly in large volumes
  • Can be easily aggregated to tell, for instance, how many orange shirts size L were ordered on game days in the 2021 season
  • Often passed from one computer program to another
  • Uses techniques to diminish the effect of human error, such as bar code readers
  • Usually presented as relations (tables with rows and columns) or hierarchies (trees)
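As a rough illustration of the aggregation bullet above, the following R sketch counts orange size-L shirts ordered on game days. The orders table, its column names, and the game-day dates are all invented for the example.

```r
# Hypothetical order records; every value here is made up for illustration.
orders <- data.frame(
  order_date = as.Date(c("2021-09-04", "2021-09-04", "2021-09-07",
                         "2021-09-18", "2021-09-18")),
  color      = c("orange", "blue", "orange", "orange", "orange"),
  size       = c("L", "L", "L", "M", "L"),
  quantity   = c(2, 1, 3, 1, 4)
)

# Hypothetical game days in the 2021 season.
game_days <- as.Date(c("2021-09-04", "2021-09-18"))

# Because every row obeys the same rules, one logical filter and a sum
# answer the question without human intervention.
is_match <- orders$color == "orange" &
  orders$size == "L" &
  orders$order_date %in% game_days
orange_L_on_game_days <- sum(orders$quantity[is_match])
orange_L_on_game_days
```

The same question asked of free-text order notes would need a person, or at least a fragile text parser, to answer it.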

Relational data

  • By far the most prevalent within organizations
  • Tables of rows (entities) and columns (attributes)
  • Tables are linked to each other
  • Good tables are long (many rows) and thin (few columns)
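A minimal sketch of linked tables in R, assuming two invented tables that share a customer_id column. The table names, column names, and values are hypothetical.

```r
# Two hypothetical tables; names and values are invented for illustration.
customers <- data.frame(
  customer_id = c(1, 2, 3),
  name        = c("Ana", "Ben", "Chi")
)
orders <- data.frame(
  order_id    = c(101, 102, 103, 104),
  customer_id = c(1, 1, 3, 2),
  amount      = c(20, 35, 15, 50)
)

# The shared customer_id column links the two tables.
linked <- merge(orders, customers, by = "customer_id")

# Total amount per customer name, computed from the linked table.
totals <- aggregate(amount ~ name, data = linked, FUN = sum)
totals
```

Both tables stay long and thin: a new order adds a row to orders rather than a column to customers.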

Hierarchical data

  • Second most common within organizations and most common between organizations
  • Example: FedEx international waybill

So an example hierarchy might look like this

   Sender Name
   Sender Address
   Sender Account Number
   Recipient Name
   Recipient Address
   Commodity 1
     Unit / Number
   Commodity 2
     Unit / Number
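One way to hold such a hierarchy in R is a nested list. This is only a sketch: the field names and placeholder values below are assumptions, not the layout of a real waybill.

```r
# A hypothetical waybill as a nested list; all values are placeholders.
waybill <- list(
  sender = list(
    name = "Sender Name",
    address = "Sender Address",
    account_number = "000000000"
  ),
  recipient = list(
    name = "Recipient Name",
    address = "Recipient Address"
  ),
  commodities = list(
    list(description = "Commodity 1", unit = "box", number = 2),
    list(description = "Commodity 2", unit = "crate", number = 1)
  )
)

# Walk down the tree: the number of units of the second commodity.
waybill$commodities[[2]]$number

# A repeated branch can be summarised by visiting each child.
total_units <- sum(sapply(waybill$commodities, function(item) item$number))
total_units
```

Calling str(waybill) prints the tree with indentation much like the outline above.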

Big data

  • Usually means data too big to be processed on a single computer, instead requiring a cluster
  • Sometimes means VVVV, where the Vs stand for Volume, Velocity, Variety, Variability (sometimes adding an extra V for more bigness)
  • MapReduce, Google’s breakthrough technique for processing big data on clusters of cheap computers, became a generic term
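The map-then-reduce idea can be sketched on a single machine with base R’s Map() and Reduce() functions. This toy word count is not Google’s system; the text chunks are invented and stand in for data that would be spread across a cluster.

```r
# Hypothetical documents, standing in for data split across machines.
chunks <- list(
  "big data big cluster",
  "cheap computers big data"
)

# Map step: each chunk independently produces its own word counts.
count_words <- function(text) {
  words <- strsplit(text, " ", fixed = TRUE)[[1]]
  counts <- table(words)
  setNames(as.vector(counts), names(counts))
}
partial_counts <- Map(count_words, chunks)

# Reduce step: merge two sets of counts by adding counts for shared words.
merge_counts <- function(a, b) {
  all_words <- union(names(a), names(b))
  sapply(all_words, function(w) sum(a[w], b[w], na.rm = TRUE))
}
word_counts <- Reduce(merge_counts, partial_counts)
word_counts[["big"]]
```

Because each map call touches only its own chunk, the calls could in principle run on different computers before the results are merged.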


Memes

  • Richard Dawkins coined the term “meme” in 1976
  • Dawkins thought of memes as patterns that encourage proliferation

Quote from The Selfish Gene (1976)

The new soup is the soup of human culture. We need a name for the new replicator, a noun that conveys the idea of a unit of cultural transmission, or a unit of imitation. ‘Mimeme’ comes from a suitable Greek root, but I want a monosyllable that sounds a bit like ‘gene’. I hope my classicist friends will forgive me if I abbreviate mimeme to meme. If it is any consolation, it could alternatively be thought of as being related to ‘memory’, or to the French word même. It should be pronounced to rhyme with ‘cream’.

Organizing information

Designers use an almost folkloric understanding of how people organize information to design information artifacts that work with, rather than against, people. Some of the borrowings from other disciplines studying the organization of information include the following.


We can group information together under labels or without labels. The latter activity is usually called clustering while the former is often called categorization. If we have labels, the question arises as to where the labels come from and who gets to identify them. Famously, Melvil Dewey reserved many labels in his library classification system for items familiar to him and European men like him, but few labels for items that were familiar to the vast majority of humans.

Card sorting

Card sorting is a common way to elicit labels. Give a person a set of cards with terms written on them and ask the person to sort the cards into piles of similar terms, then to name the piles. This is typically called an open card sort. Alternatively, provide a set of category cards in addition to the content cards and ask the person to place each content card next to the appropriate category card. This is a closed card sort. Spencer (2009) describes both. There are many variations of card sorts and an extensive literature on using them to label concepts.
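One common first step in analyzing an open card sort is to count how often each pair of cards lands in the same pile. The sketch below assumes three invented participants and four invented cards. The pile names are the participants’ own and are ignored in the counting.

```r
# Hypothetical open card sort: each participant's piles, as named lists.
sorts <- list(
  p1 = list(fruit = c("apple", "pear"), tools = c("saw", "drill")),
  p2 = list(food = c("apple", "pear", "saw"), hardware = c("drill")),
  p3 = list(snacks = c("apple", "pear"), garage = c("saw", "drill"))
)

cards <- c("apple", "pear", "saw", "drill")
together <- matrix(0, nrow = length(cards), ncol = length(cards),
                   dimnames = list(cards, cards))

# For every pile, add one to each pair of cards that share the pile.
for (participant in sorts) {
  for (pile in participant) {
    together[pile, pile] <- together[pile, pile] + 1
  }
}
diag(together) <- 0  # a card trivially shares a pile with itself
together
```

Pairs with high counts are candidates for a shared category. The differing pile names (fruit, food, snacks) still have to be reconciled by a person.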

Interrater reliability

After you conduct a card sort, how do you evaluate your results? If you’ve recorded several people sorting the same cards, you can measure interrater reliability, using Cohen’s \(\kappa\) (pronounced Kappa).

Wikipedia definition of Cohen’s \(\kappa\)

\[\kappa \equiv \frac{p_o-p_e}{1-p_e} = 1 - \frac{1-p_o}{1-p_e}\]

where \(p_o\) is the proportion of observed agreement between raters (same as accuracy, defined as the number of agreed items divided by the total number of items), and

\[p_e= \frac{1}{N^2}\sum_k n_{k1}n_{k2}\]

where \(k\) indexes the categories, \(N\) is the number of items, and \(n_{ki}\) is the number of times rater \(i\) assigned category \(k\).

More on Cohen’s \(\kappa\)

Cohen’s original 1960 article also defines \(\kappa\) in terms of frequencies of observed agreement, \(f_o\), and agreement expected by chance, \(f_c\):

\[\kappa = \frac{f_o-f_c}{N-f_c}\]
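The frequency form and the proportion form are the same measure. A short check, assuming that the frequencies are simply the proportions multiplied by the number of items, so that \(f_o = N p_o\) and \(f_c = N p_e\):

\[\kappa = \frac{f_o-f_c}{N-f_c} = \frac{N p_o - N p_e}{N - N p_e} = \frac{p_o-p_e}{1-p_e}\]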

Why should you use this measure of interrater reliability? The problem is that people will agree to some extent by chance. You should try to account for chance agreement in a measure of agreement.

Synthetic example in R

Suppose you have two raters and twenty items. Each item can be rated as 0 or 1. You can simulate this easily with random binomial draws as follows.

theta <- 0.5
N <- 20
rater1 <- rbinom(n = N, size = 1, prob = theta)
rater2 <- rbinom(n = N, size = 1, prob = theta)
twentyitems <- cbind(rater1, rater2)

The output of twentyitems is as follows in one example run. Each run will differ because of the random number generation.

Sample run

      rater1 rater2
 [1,]      0      0
 [2,]      0      0
 [3,]      1      0
 [4,]      1      0
 [5,]      0      0
 [6,]      1      1
 [7,]      1      1
 [8,]      1      0
 [9,]      0      1
[10,]      0      1
[11,]      1      0
[12,]      1      1
[13,]      0      0
[14,]      1      0
[15,]      0      0
[16,]      0      1
[17,]      0      0
[18,]      1      0
[19,]      0      0
[20,]      0      0

Using the irr package

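if (!require(irr)) install.packages("irr")  # install irr if it is missing
library(irr)
agree(twentyitems)  # percentage agreement between the two raters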

The output of agree() is as follows.

 Percentage agreement (Tolerance=0)

 Subjects = 20
   Raters = 2
  %-agree = 55


The tolerance=0 parameter says that you don’t allow similar scores to be interpreted as the same. For example, suppose instead of 0 or 1, the raters could choose any integer from 0 to 100. You might want the difference between 50 and 52 to be interpreted differently than the difference between 10 and 90. You might even say that they agree if their scores are 50 and 52. The tolerance parameter allows you to tune for this.
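The effect of a tolerance can be sketched in base R. This illustrates the idea only and is not the irr implementation. The scores and the function name percent_agreement are invented for the example.

```r
# Hypothetical scores from two raters on a 0 to 100 scale.
scores <- cbind(
  rater1 = c(50, 10, 75, 33),
  rater2 = c(52, 90, 75, 30)
)

# Share of items on which all raters fall within `tolerance` of each other.
# A base-R sketch of the idea, not the irr implementation.
percent_agreement <- function(ratings, tolerance = 0) {
  spread <- apply(ratings, 1, function(row) max(row) - min(row))
  100 * mean(spread <= tolerance)
}

percent_agreement(scores, tolerance = 0)  # only the exact match counts
percent_agreement(scores, tolerance = 2)  # 50 vs 52 now counts as agreement
```

With a tolerance of 2, the pair 50 and 52 counts as agreement, while 10 and 90 still does not.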

Calculating \(\kappa\)

Now calculate Cohen’s \(\kappa\) with kappa2(twentyitems) to adjust for the possibility of chance agreement.


 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 20
   Raters = 2
    Kappa = 0.0625

        z = 0.294
  p-value = 0.769
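To see where the reported Kappa comes from, the definition can be applied by hand to the ratings in the sample run. The sketch below uses only base R. The variable names are my own, and the two vectors are typed in from the sample run shown earlier.

```r
# Ratings copied from the sample run shown earlier.
rater1 <- c(0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0)
rater2 <- c(0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0)
N <- length(rater1)

# Observed agreement: proportion of items rated identically.
p_o <- mean(rater1 == rater2)

# Chance agreement: for each category, multiply the two raters' counts,
# sum over categories, and divide by N squared.
categories <- c(0, 1)
n1 <- sapply(categories, function(k) sum(rater1 == k))
n2 <- sapply(categories, function(k) sum(rater2 == k))
p_e <- sum(n1 * n2) / N^2

kappa_by_hand <- (p_o - p_e) / (1 - p_e)
c(p_o = p_o, p_e = p_e, kappa = kappa_by_hand)
```

Here \(p_o = 0.55\) and \(p_e = 0.52\), so almost all of the 55 percent agreement is what chance alone would produce, and \(\kappa = 0.03 / 0.48 = 0.0625\).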

Open coding and closed coding

  • Difference is whether researcher supplies the labels or participants do
  • Open coding involves an extra step to account for synonyms

Example of closed card sort

Now suppose you have five piles of cards in a closed card sort. You can simulate this by drawing pile numbers from 1 to 5 with equal probability.

N <- 5
# sample() gives each pile equal probability; rounding runif() draws
# would make piles 1 and 5 half as likely as the others
rater1 <- sample(1:5, size = N, replace = TRUE)
rater2 <- sample(1:5, size = N, replace = TRUE)
rater3 <- sample(1:5, size = N, replace = TRUE)
rater4 <- sample(1:5, size = N, replace = TRUE)
fiveitems <- cbind(rater1, rater2, rater3, rater4)


The output of fiveitems is as follows. Note that your results will differ with the same code because of random number generation.

     rater1 rater2 rater3 rater4
[1,]      2      5      1      2
[2,]      4      2      3      5
[3,]      3      4      3      4
[4,]      4      2      3      4
[5,]      3      4      1      3

\(\kappa\) with more than two raters

You can’t use kappa2() with more than two raters, but the irr package provides kappam.fleiss() and kappam.light() for multiple raters.



The output for agree(fiveitems) is

 Percentage agreement (Tolerance=0)

 Subjects = 5
   Raters = 4
  %-agree = 0

The output for kappam.fleiss(fiveitems) is

 Fleiss' Kappa for m Raters

 Subjects = 5
   Raters = 4
    Kappa = -0.0965

        z = -0.975
  p-value = 0.329

The output for kappam.light(fiveitems) is

 Light's Kappa for m Raters

 Subjects = 5
   Raters = 4
    Kappa = -0.0286

        z = NaN
  p-value = NaN
Warning message:
In sqrt(varkappa) : NaNs produced

Comments on output

When I run these, they give slightly different results. I’m not sure how much that matters but my guess is not much. You should report which function you used. One nice thing about Fleiss’s \(\kappa\) is that it allows missing values. That is to say that, if you have some piles that differ from others, you could leave part of a row blank if only some raters have a score for it. Check how your software treats blanks, though: kappam.fleiss() in irr drops any row that contains a missing value.

Light’s \(\kappa\) is briefly described in Hallgren (2012). That tutorial gives R code for several versions of \(\kappa\), including Cohen’s \(\kappa\), computed above with kappa2(), and both Fleiss’s and Light’s \(\kappa\).
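As a check on the Fleiss figure, the calculation can be done by hand on the fiveitems output shown earlier. This is a base-R sketch of the standard Fleiss formula, assuming every rater scores every item. The intermediate names (counts, P_i, P_bar, P_e) are my own.

```r
# Ratings copied from the fiveitems output shown earlier (rows are items).
fiveitems <- rbind(
  c(2, 5, 1, 2),
  c(4, 2, 3, 5),
  c(3, 4, 3, 4),
  c(4, 2, 3, 4),
  c(3, 4, 1, 3)
)
n_raters <- ncol(fiveitems)
piles <- 1:5

# counts[i, j]: how many raters put item i into pile j.
counts <- t(apply(fiveitems, 1, function(row) {
  sapply(piles, function(j) sum(row == j))
}))

# Agreement within each item, then averaged over items.
P_i <- (rowSums(counts^2) - n_raters) / (n_raters * (n_raters - 1))
P_bar <- mean(P_i)

# Chance agreement from the overall share of ratings in each pile.
p_j <- colSums(counts) / sum(counts)
P_e <- sum(p_j^2)

fleiss_by_hand <- (P_bar - P_e) / (1 - P_e)
round(fleiss_by_hand, 4)
```

The average within-item agreement (about 0.167) is below the chance level (0.24), which is why the result is slightly negative and matches the -0.0965 reported by kappam.fleiss().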

Grouping information items

  • Classical approach uses similar attributes
  • Prototype approach uses comparison to a specific thing

Flat and hierarchical grouping

  • Piling vs Filing again!
  • Hierarchies require each thing to be in one location
  • Tagging allows each thing to be in multiple locations
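The difference between filing and tagging shows up in the data structure. In this sketch the file names, folder paths, and tags are all invented.

```r
# Hypothetical items; folder and tag names are invented for illustration.
# Filing: a hierarchy gives each item exactly one location.
folder_of <- c(
  "budget.xlsx" = "work/finance",
  "trip.jpg"    = "personal/photos",
  "receipt.pdf" = "work/finance"
)

# Tagging: each item may carry several labels at once.
tags_of <- list(
  "budget.xlsx" = c("work", "finance", "2021"),
  "trip.jpg"    = c("personal", "photos", "2021"),
  "receipt.pdf" = c("work", "finance", "travel")
)

# Looking up a tag can return items that live in different folders.
items_tagged <- function(tag) {
  names(tags_of)[sapply(tags_of, function(tags) tag %in% tags)]
}
items_tagged("2021")
```

The tag 2021 gathers a work spreadsheet and a personal photo, something the folder hierarchy cannot express without copying one of them.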

Automatic and supervised grouping

  • Labels require an authority to say which things belong to which label
  • Authority is usually a person or group of people
  • Grouping without labels can be done by a machine
  • Training and testing sets are employed by machines as a form of supervision
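The contrast between grouping without labels and grouping with labels can be sketched using the iris data set that ships with R. The nearest-average classifier below is deliberately simple and was chosen for brevity. The every-fifth-row split is an arbitrary assumption.

```r
# iris ships with R: 150 flowers, four measurements, one species label.
measurements <- iris[, 1:4]

# Grouping without labels: hierarchical clustering sees only measurements.
tree <- hclust(dist(measurements))
clusters <- cutree(tree, k = 3)
table(clusters)

# Grouping with labels: hold out a test set, learn from the training set.
test_rows <- seq(1, nrow(iris), by = 5)  # every fifth flower, arbitrary split
train <- iris[-test_rows, ]
test  <- iris[test_rows, ]

# A deliberately simple classifier: assign each test flower to the species
# whose average training measurements are closest.
centroids <- aggregate(. ~ Species, data = train, FUN = mean)
predict_species <- function(flower) {
  distances <- apply(centroids[, -1], 1, function(centre) {
    sqrt(sum((as.numeric(flower) - centre)^2))
  })
  as.character(centroids$Species[which.min(distances)])
}
predicted <- apply(test[, 1:4], 1, predict_species)
accuracy <- mean(predicted == as.character(test$Species))
accuracy
```

The clustering step never sees the species column. The classifier depends on it, and that column exists only because a botanist, acting as the authority, labeled each flower.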

Human computation

  • Relatively new field in computer science
  • Asks people to do what machines can’t
  • Gamifies tasks, such as identifying craters on the moon
  • Origin of captchas

Networked information

  • Network may mean connections between computers
  • Network may be used in the sense of network science, which studies all networked phenomena, such as social networks

Supply chain information

  • Information flows between productive enterprises
  • Usually have a channel captain who dictates information format
  • Typically use XML
  • XML can be read by humans but is inefficient for internal processing

Connecting Concepts

Four concepts come together to give us tools to build information containers: card sort, monitoring navigation, monitoring social networks, and flexibility of information representation. Let us briefly review them.

Understanding labels

We discussed card sort as a means of understanding the labels people use to describe things of interest. We considered the issue of cognitive dissonance in the labeling of information containers and card sorting as a means to overcome it. We can connect the concepts of information hiding and labeling to see how labeling helps to limit information overload.

Learning navigational behavior

  • Google learns from user navigational behavior
  • Search behavior is bartered by information brokers
  • LSOs (local shared objects) and tracking pixels are often used instead of cookies
  • A/B testing is the most common tool for understanding navigational behavior
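A minimal sketch of how an A/B test might be analyzed in R, using prop.test() from the stats package. The visitor and click counts are invented. A real test would also need a sample size and stopping rule decided in advance.

```r
# Hypothetical results: visitors shown each design and how many clicked through.
visitors <- c(A = 1000, B = 1000)
clicks   <- c(A = 120,  B = 160)

# Click-through rate for each variant.
rates <- clicks / visitors
rates

# Two-sample test of equal proportions from base R's stats package.
result <- prop.test(x = clicks, n = visitors)
result$p.value
```

A small p-value suggests the difference in click-through rates is unlikely to be chance alone. It does not say why users preferred one design.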

Social network influence on navigation

All commercial interests have recognized the significance of social networks and have devised ways to exploit social networks to influence navigation. Many navigational features in common use today are the result of specialists in a new field called network science drawing together research in many fields to understand human behavior and influence in networks. They use terms like betweenness centrality and network closeness. Major figures in the field include M.E.J. Newman, Stanley Wasserman, Albert-László Barabási, Duncan Watts, and Lada Adamic.
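To give a feel for such measures, the sketch below computes degree and closeness for a tiny invented network in base R. It assumes an undirected, connected network. Betweenness is omitted for brevity, and a dedicated network package would normally be used instead.

```r
# A hypothetical friendship network as an adjacency matrix (1 = connected).
people <- c("Ann", "Bo", "Cy", "Di", "Ed")
adj <- matrix(0, 5, 5, dimnames = list(people, people))
edges <- rbind(c("Ann", "Bo"), c("Bo", "Cy"), c("Cy", "Di"), c("Cy", "Ed"))
for (i in seq_len(nrow(edges))) {
  adj[edges[i, 1], edges[i, 2]] <- 1
  adj[edges[i, 2], edges[i, 1]] <- 1
}

# Degree: number of direct connections.
degree <- rowSums(adj)

# Shortest path lengths via the Floyd-Warshall algorithm.
dist_mat <- ifelse(adj == 1, 1, Inf)
diag(dist_mat) <- 0
for (k in people) for (i in people) for (j in people) {
  dist_mat[i, j] <- min(dist_mat[i, j], dist_mat[i, k] + dist_mat[k, j])
}

# Closeness: (n - 1) divided by the total distance to everyone else.
closeness <- (length(people) - 1) / rowSums(dist_mat)
round(closeness, 2)
```

Cy has the highest closeness because everyone else is at most two steps away, which is the kind of position a recommendation feature might try to exploit.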

Information structure

We have extensively discussed how and whether information is structured, using as a principle the degree to which human intervention is required to process information. We have discussed hierarchical and relational ways of organizing and storing information.

Flexibility of representation

We have touched on the notion that information structures are more or less amenable to change. Brittle structures may be symptomatic of technical shortcomings or may be symptoms of authoritarian governance. We discussed whether the speed with which we can modify an information artifact matters in a given context. We saw that we may put together an information artifact with little planning if we expect to take advantage of user behavior to improve it but that, if we cannot or will not change an information artifact after publishing it, we cannot realize the value of understanding navigational behavior.

Information Design Patterns

We discussed several elements that information architecture authors have referred to as information design patterns. To determine whether these elements deserve the label of design pattern, we must examine the coinage and past use of the term.

Design Pattern Definitions

The term design pattern is popularly used in many ways. Popular usage leads to an abbreviation of the original usage that may lose some of the original essence. Following are a few popular borderline uses of the term that barely work.

  • A general reusable solution to a commonly occurring problem within a given context
  • A description (or template) for how to solve a problem that can be used in many different situations
  • An enumeration of the consequences of the use of the pattern in a given context
  • Patterns are formalized best practices

Design patterns were first observed

Design patterns originated as an architectural concept in Alexander (1977). Alexander examined architecture from the standpoint of its value to a community of people in daily life. Alexander’s ideas were largely ignored or rejected by architects but soon gained a cult following among computer scientists. Eventually his books became so popular outside architecture that they began to influence architecture.

A Pattern Language

Alexander describes a pattern language as a structured method of describing good design practices within a field of expertise. The term was coined by Christopher Alexander and popularized by his book A Pattern Language. This book was followed by another book intended to explain the first book. Alexander has continued to try to explain the concept to this day.

Components of a Pattern Language

The Syntax describes where the solution fits into the larger design. The Grammar describes how the solution solves the problem. For example, “Balconies and porches which are less than 6 feet deep are hardly ever used.”

An example of a pattern is a place to wait

The problem is that the process of waiting has inherent conflicts in it. The solution: In places where people end up waiting (for a bus, for an appointment, for a plane), create a situation which makes the waiting positive.

An example of a pattern is a useful cooking layout

The problem is that cooking is uncomfortable if the kitchen counter is too short and also if it is too long. Solution: To strike the balance between the kitchen which is too small, and the kitchen which is too spread out, place the stove, sink, and food storage and counter in such a way that:

  1. No two of the four are more than 10 feet apart.
  2. The total length of the counter—excluding sink, stove, and refrigerator—is at least 12 feet.
  3. No one section of the counter is less than 4 feet long.

Computer scientists popularized design patterns

The Gang of Four (commonly abbreviated GoF) were among computer scientists seeking a basis to make code less arcane, more scientific and, above all, reusable.

One aspect of Alexander’s description was so general that it seemed applicable to any field in which design plays a role. This key aspect was the notion of a quality that could not be named but that could be understood through experience—the quality shared by successful designs. Specific and non-obvious combinations of characteristics could support this quality.

Gang of Four book

Gamma et al. (1994) exploded on the software scene and propelled Alexander to greater fame at the same time as solidifying Object Orientation’s place in mainstream software development.

The GoF argue that great writers use patterns, e.g., all of Shakespeare’s plays were based on earlier, less successful plays or stories. The GoF refer to the tragically flawed hero or boy-meets-girl, boy-loses-girl as patterns with infinite variety. The GoF book serves two purposes: to tell what patterns are and to catalog 23 well-known patterns.

Gang of Four pattern definition

A design pattern is a description of communicating objects and classes customized to solve a general design problem in a particular context. (from the introduction to Design Patterns, 1994)

A pattern has four things

  1. Pattern name
  2. Problem
  3. Solution
  4. Consequences

A pattern name is a tool

The pattern name must be good enough to become part of the design vocabulary. The pattern must be useful in conversation, documentation, and thinking. The GoF spent a lot of its time on the names of the 23 patterns in the catalog.

A problem may be of several kinds

The first kind includes basic design problems such as algorithm design. Another kind includes commonly occurring classes or object structures known to be problematic. A third kind includes lists of conditions that, if they occur together, create a generic problem.

A solution is a description of objects and classes

It is not a solution in a packaged sense. A solution is abstract, not implementation specific. A solution is a description of the elements of the solution (objects and classes). The description must identify

  • Relationships between elements
  • Responsibilities of elements
  • Collaborations between elements

A consequence is a result or trade-off

The application of a pattern may resolve conflicts of various kinds, most often conflicts of space and time. To contemplate the use of a design pattern is to evaluate the design decision with awareness of the consequences. Consequences may have implementation issues, unlike the solution. If you feel tempted to talk about implementation, do so under the consequences banner instead of under the solution banner. Keep the solution a description, not an evaluation of itself.

Design patterns in information architecture

  • Not (in my view) real design patterns
  • They are anything commonly used on the web, such as “tabbed menus”, “hierarchies”, or “hub-and-spoke” structures
  • They would need elements from Alexander’s original concept to elevate them to the level of design patterns

Magic fairy dust

A pioneer of HCI, Gary Olson, once told me that his main interdisciplinary frustration was that practitioners in other fields often wanted him to sprinkle magic fairy dust on their products. As HCI practitioners, you must avoid the complementary trap: don’t mistake the work of Alexander and the GoF as magic fairy dust that can be sprinkled on your information architecture.

What would a real design pattern look like?

  • Nick Belkin (personal communication) gave the example of the Warburg Institute
  • It contains a discovery system organized around the activities of scholars interested in the Renaissance
  • It permits serendipitous discoveries by placing items close together if they are likely to be used together
  • It’s a few blocks from the world’s largest library, yet scholars use it

Alexander’s point applied to digital artifacts

  • There is this quality that cannot be named that characterizes what communities find useful and enduring
  • Our job as designers is to find the context of use by the community and organize according to it
  • What is enduring on the web or in apps? Why? (Much is forced by monopolies, but some is driven by community)


References

Alexander, Christopher. 1977. A Pattern Language: Towns, Buildings, Construction. New York, NY: Oxford University Press.
Cover, Thomas M, and Joy A Thomas. 2006. Elements of Information Theory. Hoboken, NJ: John Wiley & Sons.
Dawkins, Richard. 1976. The Selfish Gene. New York, NY: Oxford University Press.
Gamma, Erich, Richard Helm, Ralph Johnson, and John Vlissides. 1994. Design Patterns: Elements of Reusable Object-Oriented Software. Upper Saddle River, NJ: Addison-Wesley Professional.
Hallgren, Kevin A. 2012. “Computing Inter-Rater Reliability for Observational Data: An Overview and Tutorial.” Tutorials in Quantitative Methods for Psychology 8 (1): 23–34.
Redish, Janice. 2014. Letting Go of the Words: Writing Web Content That Works. 2nd ed. Waltham, MA: Morgan Kaufmann.
Rosenfeld, Louis, Peter Morville, and Jorge Arango. 2015. Information Architecture: For the Web and Beyond. 4th ed. Sebastopol, CA: O’Reilly Media.
Shannon, Claude E. 1948. “A Mathematical Theory of Communication.” Bell System Technical Journal 27 (3): 379–423.
Spencer, Donna. 2009. Card Sorting: Designing Usable Categories. Brooklyn, NY: Rosenfeld Media.



This slideshow was produced using quarto

Fonts are League Gothic and Lato