
# Evaluation

How to tell if a visualization is working.

  1. Assess the domain problem (do others agree with your assumptions?)
  2. Determine the data and tasks (can people build the knowledge they need?)
  3. Choose the right encodings (can people see the patterns they need correctly?)
  4. Map data to encodings via an algorithm (does the algorithm perform correctly?)
  5. Design interactions to explore the data (can people quickly and intuitively interact with the data?)

# Holistic Evaluation

# Insight-Based Evaluation

Quantify the knowledge gained from a visualization, in a manner similar to ethnography.

# Experimental Evaluation

Run a controlled study to measure how quickly and accurately people can complete tasks using different visualizations.

# Insight

The fundamental unit of measurement when evaluating data visualizations, and the purpose of visualization itself.

A unit of discovery: what does the tool enable someone to do?

Two types:

Metrics for Insight-Based Evaluation:

# Qualitative Evaluation

Collect quotes, use cases, and anecdotes that help illustrate the effectiveness of your solution.

Use insight as the key measure during these studies. Document as much as possible, including metadata about your participants.

# Systematic Surveys

# Semi-Structured Interviews

# Think-Aloud Studies

# Journaling Studies

# Experimental Design

More precise than insight-based methods, but less grounded in the domain. Measure how people complete a specific set of tasks under different conditions.

  1. Form a specific question
  2. Generate a set of falsifiable hypotheses
  3. Determine your independent (what you change) and dependent (what you measure) variables.
  4. Build your stimuli & experimental infrastructure (task framing, how to complete task, data collection, etc.)
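The steps above can be sketched as a small study specification. This is a minimal, hypothetical example — the question, variable names, and values are illustrative, not part of any real study or API:

```python
from itertools import product

# A hypothetical study specification following the four steps above;
# every name and value here is illustrative.
study = {
    "question": "Do bar charts support faster value comparison than pie charts?",
    "hypotheses": [
        "H1: mean completion time is lower with bar charts",    # falsifiable
        "H0: completion time does not differ between designs",  # null
    ],
    "independent": {"chart_type": ["bar", "pie"]},      # what you change
    "dependent": ["completion_time_s", "error_count"],  # what you measure
}

# Cross the independent-variable levels into concrete experimental conditions.
conditions = [dict(zip(study["independent"], combo))
              for combo in product(*study["independent"].values())]
print(conditions)  # → [{'chart_type': 'bar'}, {'chart_type': 'pie'}]
```

With more than one independent variable, the same cross product enumerates every condition cell you need stimuli for.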

# Analyze your data

Descriptive statistics:

Use measures of the data's distribution to estimate whether there are differences between levels of the independent variables.

Inferential Tests:

Use statistical tests to estimate the likelihood that your observed differences reflect a true difference rather than chance.
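A minimal sketch of both steps in Python, using made-up completion times for two hypothetical chart designs (Welch's t statistic stands in for whichever inferential test fits your design):

```python
import statistics

def describe(sample):
    """Descriptive statistics: summarize one condition's distribution."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "stdev": statistics.stdev(sample),
    }

def welch_t(a, b):
    """Welch's t statistic for two independent samples (inferential step).
    The larger |t| is, the less likely the observed difference is chance
    alone; compare it against a t distribution to get a p-value."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se

# Hypothetical task-completion times (seconds) under two conditions.
bar = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3]
pie = [14.8, 15.2, 13.9, 14.4, 15.0, 14.6]

print(describe(bar))      # descriptive: per-condition summary
print(welch_t(bar, pie))  # inferential: standardized difference
```

In practice you would use a statistics library for the full test (degrees of freedom, p-value) rather than hand-rolling it.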

# Evaluation Trade-Offs

Qualitative:

Experimental:

# Formative Evaluation

Test your understanding of the problem space and gather insight into users' processes. Measure how well different designs optimize for a given set of tasks or goals.

  - Area survey: what are the core tasks and needs of the problem?
  - Preference mining: what designs do people like?

# Summative Qualitative Evaluation

Measure how well a given tool supports a domain. Provide a measure of performance against a target baseline.

  - Think aloud: what are users' impressions of a tool?
  - Horse race: which design is most efficient for a set of tasks?
  - Popular vote: which design do people like best?