HCI:
Empirical Evaluation

Mick McQuaid

2024-03-21

Week NINE

Today

  • Q and A from last time
  • Discussion leading ()
  • Design Critique (Swaraj?, Keyur)
  • Article Presentation ()
  • Break (break may be earlier or later in sequence)
  • Empirical Evaluation

Q and A from last time

Learning 1 of 4

Learning about Naheel Jawaid’s presentation was very interesting, particularly the insight that having all doors open could mean being last in line. Essentially, this would suggest that I must fully commit to a path to make significant progress. It resonates with the advice my dad consistently gave me about life, which I used to resist because I believed in keeping all options open. However, after hearing Naheel’s perspective, I now see the value in committing to a direction, though I still believe in maintaining some flexibility within reason.

Learning 2 of 4

I enjoyed the discussions about Airbnb vs. Craigslist. I always thought Craigslist was cluttered and the design a little archaic. I never realized that the clutter was an intentional design choice (so that users feel like they're getting a bargain).

Learning 3 of 4

I appreciate those two portfolio website examples, which are really impressive. And thank you, Rachel, for teaching me something I've always wanted to learn but hadn't figured out.

Learning 4 of 4

I thought the discussion on the different categorizations of prototype variants was interesting.

Q&A 1 of 5

N/A. Shout out to Rachel!! I’ve noticed that my ability to follow along with these Figma tutorials has improved significantly, and I’m not feeling as lost!

Q&A 2 of 5

Can you provide us with more examples of good portfolios?

Q&A 3 of 5

rauno.me/craft was really satisfying and fun to look at.

Q&A 4 of 5

When I use websites that look old, I doubt whether the content and system are still maintained and up to date. I'm curious to hear more about the good and bad experiences with these sites.

Q&A 5 of 5

I am curious about trying the other tools recommended by Naheel (the speaker whose talk you summarized for us). It is nice to hear more about what designers are using and trying outside of Figma (since Figma really boomed and seems rather ubiquitous, but there are other tools).

Discussion

Experts of novel products

How do you find expert users for novel products that no one has seen before?

(Note re iPhone: Android team members I talked to said that, as soon as they saw it, they knew they had to scrap their existing work.)

Surprising behavior

Which empirical evaluation techniques are most likely to lead to surprising behavior and possible design pivots?

Qual vs Quant

Is qualitative data more valuable than quantitative data? If so, why? Is it possible that it depends on context?

(Note: I just had a horrendous experience with the KLM website over break and yet they tried to survey me every step of the way.)

How is data collection accomplished?

Do you partner with data analysis experts or learn all the skills? How do you track and store observations, time on task, user comments, and more?

Benchmark tasks

How do you select benchmark tasks?

(Note: Isn’t this largely dictated by the goals of the system being tested?)

Collaboration between Design and Engineering

How can UX and engineering teams establish a collaborative and integrated approach?

Empirical evaluation

Spectrum of measurement

I claim that there is no sharp line between objective and subjective measurements; rather, they lie on a spectrum. Do you believe that?

I also claim that there is a similar gray area between quantitative and qualitative data. What do you think?

Scales

  • Ratio: can say this is twice as much as that, e.g., money
  • Interval: can say this is a certain amount more than that, but there is no true zero, e.g., temperature in Celsius or Fahrenheit
  • Ordinal: can rank, can say this is more than that but not how much, e.g., competitors in a dance contest
  • Nominal: can say this differs from that, e.g., gender
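
Scales in code (a sketch)

A minimal Python sketch, not from the readings, of what each of the four scales above licenses; the participant data are invented for illustration.

```python
# Invented usability-study data illustrating the four scales of measurement.
import statistics

participants = [
    {"id": "P1", "role": "novice", "satisfaction": 4, "room_temp_c": 21.0, "time_on_task_s": 95.0},
    {"id": "P2", "role": "expert", "satisfaction": 5, "room_temp_c": 23.5, "time_on_task_s": 40.0},
    {"id": "P3", "role": "novice", "satisfaction": 2, "room_temp_c": 22.0, "time_on_task_s": 80.0},
]

# Nominal (role): counting category members is valid; averaging is not.
roles = [p["role"] for p in participants]
print({r: roles.count(r) for r in set(roles)})

# Ordinal (1-5 satisfaction item): the median is defensible; the gaps between
# scale points are not guaranteed to be equal.
print(statistics.median(p["satisfaction"] for p in participants))

# Interval (temperature in Celsius): differences are meaningful, ratios are
# not (40 degrees is not "twice as hot" as 20 degrees).
temps = [p["room_temp_c"] for p in participants]
print(max(temps) - min(temps))

# Ratio (time on task): a true zero exists, so "twice as long" is meaningful.
times = [p["time_on_task_s"] for p in participants]
print(max(times) / min(times))
```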

Formative evaluation

  • conducted while in process
  • conducted to refine

Summative evaluation

  • conducted after process
  • conducted to determine final fitness
  • usually only done in big software contracts, often after a waterfall process

Empirical vs analytic evaluation

  • real users vs experts
  • observation vs automated checks

Planning

  • Decide a priori what you plan to evaluate and establish measures in advance
  • Consider new users, experts, consequences of errors, sources of satisfaction

Whitney Quesenbery posits 5 Es

  • Effective: How completely and accurately the work or experience is completed or goals reached
  • Efficient: How quickly this work can be completed
  • Engaging: How well the interface draws the user into the interaction and how pleasant and satisfying it is to use
  • Error Tolerant: How well the product prevents errors and can help the user recover from mistakes that do occur
  • Easy to Learn: How well the product supports both the initial orientation and continued learning throughout the complete lifetime of use

Five E techniques (1 of 3)

  • Effective: Watch for the results of each task, and see how often they are done accurately and completely. Look for problems like information that is skipped or mistakes that are made by several users.
  • Efficient: Time users as they work to see how long each task takes to complete. Look for places where the screen layout or navigation makes the work harder than it needs to be (a brief timing sketch follows).
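
Timing users (a sketch)

A minimal sketch of my own, not Quesenbery's, of what "time users as they work" can produce; the participants, task, and target level are hypothetical.

```python
# Hypothetical time-on-task log (seconds) for one benchmark task.
import statistics

time_on_task_s = {
    "P1": 95.0,   # hypothetical task: "find and book the cheapest flight"
    "P2": 40.0,
    "P3": 80.0,
    "P4": 120.0,
}

times = list(time_on_task_s.values())
print(f"mean   = {statistics.mean(times):.1f} s")
print(f"median = {statistics.median(times):.1f} s")
print(f"stdev  = {statistics.stdev(times):.1f} s")

# Compare against a target level decided in advance (hypothetical: 90 s),
# as a UX target table would record.
target_s = 90.0
met_target = sum(t <= target_s for t in times)
print(f"{met_target} of {len(times)} participants met the target")
```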

Five E techniques (2 of 3)

  • Engaging: Watch for signs that the screens are confusing or difficult to read. Look for places where the interface fails to draw the users into their tasks. Ask questions after the test to see how well they liked the product, and listen for things that kept them from being satisfied with the experience.
  • Error Tolerant: Create a test in which mistakes are likely to happen, and see how well users can recover from problems and how helpful the product is. Count the number of times users see error messages and note how those errors could have been prevented.

Five E techniques (3 of 3)

  • Easy to Learn: Control how much instruction is given to the test participants, or ask experienced users to try especially difficult, complex, or rarely used tasks. Look for places where the on-screen text or workflow helps…or confuses the user.

UX target table

  • Work role: user class
  • UX goal
  • UX measure (what is measured)
  • Measuring instrument
  • UX metric (how it is measured)
  • Baseline level
  • Target level
  • Observed results
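
UX target table row (a sketch)

A sketch only; the column names follow the list above, but the values are invented rather than an example from Hartson and Pyla.

```python
# One hypothetical row of a UX target table, using the columns listed above.
ux_target_row = {
    "work_role_user_class": "new customer",
    "ux_goal": "ease of first-time use",
    "ux_measure": "initial user performance",
    "measuring_instrument": "benchmark task: add an item to the cart",
    "ux_metric": "mean time on task, in seconds",
    "baseline_level": 120,     # seconds, measured on the current release
    "target_level": 90,        # seconds, goal for the redesign
    "observed_results": None,  # filled in after the evaluation session
}

# After a session, record what was observed and check it against the target.
ux_target_row["observed_results"] = 85
print(ux_target_row["observed_results"] <= ux_target_row["target_level"])
```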

Steve Krug’s approach

(pause for video)

Readings

Readings last week included Hartson and Pyla (2019): Ch 20

Readings this week include Hartson and Pyla (2019): Ch 22–24

Assignments

None

References

Hartson, Rex, and Pardha Pyla. 2019. The UX Book, 2nd Edition. Cambridge, MA: Morgan Kaufmann.

END

Colophon

This slideshow was produced using Quarto

Fonts are League Gothic and Lato