Heuristics and their role in design evaluation

Many of us try to drink water during the day. We’ve been taught that adults should drink about two litres of water a day to promote general health. But do you know where this “rule of thumb” originated? More to the point, do you need to know the full story behind this widely used rule of thumb in order to benefit from it?

The term heuristic is a fancy way of saying “rule of thumb.” Heuristics simplify subject expertise so that it can be used by someone who isn’t an expert. Heuristics can speed up decision making, help assess a situation, or identify flaws in a system. So, even if you’re not a biologist, medical doctor, or dietician, you can still benefit from the wisdom of drinking water every day.

Design heuristics serve the same purpose: they are shorthand for the best usability principles and guidelines we use when designing everything from consumer products to websites. Groups of design heuristics are used as a sort of checklist to identify the strengths and weaknesses of a product experience; this activity is called a heuristic evaluation. You can tailor the set of heuristics used during an evaluation to reflect what you’re hoping to learn, what your product principles are, or what you’re measuring.

For example, a fairly standard heuristic for software applications is to make help available to the user. If the product you’re evaluating doesn’t provide a help system to the user (e.g., tooltips, clear error dialogs, manual), it ranks poorly for that heuristic. By looking at how well — or poorly — the product conforms to each of the heuristics used during an evaluation, we capture a snapshot of its usability.

Heuristic sets

Researchers have developed many heuristic sets. Some have been created within industry, but many have come from academia. This makes sense: Which researcher wouldn’t want a tool to help quantify software usability, something which can often be tricky to do well?

Jakob Nielsen is a usability expert with an academic background in human-computer interaction, interaction design, and usability testing. He is known for many things, including the heuristic set he created in the early 1990s. His “10 Usability Heuristics for User Interface Design” includes “the system should always keep users informed about what is going on, through appropriate feedback within reasonable time”, and the system should “help users recognize, diagnose, and recover from errors”.

Jakob Nielsen's 10 Usability Heuristics for User Interface Design

Visibility of system status. Keep users informed through appropriate feedback, within reasonable time.
Match between system and real world. Use language and concepts borrowed from the user's world.
User control and freedom. Provide a clearly marked way to get out of mistakes, such as undo and redo.
Consistency and standards. Don't make the user guess what terms, actions, or situations mean: follow platform standards.
Error prevention. Good error messages are important, but recognizing error-prone conditions and asking users to confirm their actions is better.
Recognition rather than recall. Minimize the user's memory load by making key objects, actions, and options visible. Place important information where the user will need it.
Flexibility and efficiency of use. The system should cater to both inexperienced and experienced users. Allow users to tailor frequent actions.
Aesthetic and minimalist design. Don't show irrelevant information, as it competes with the useful information and makes it harder to see.
Help users recognize, diagnose, and recover from errors. Error messages should be in plain language, indicate the problem, and suggest a solution.
Help and documentation. Help systems should be easy to search, focus on the user's tasks, list concrete steps, and not be too large.

Another important heuristic set is Ben Shneiderman’s “Eight Golden Rules of Interface Design”.

Ben Shneiderman's Eight Golden Rules of Interface Design

Strive for consistency.
Seek universal usability.
Offer informative feedback.
Design dialogs to yield closure.
Prevent errors.
Permit easy reversal of actions.
Keep users in control.
Reduce short-term memory load.

You may notice that there is some overlap between these two sets. But, as digital products evolve, so too do the maturity and scope of heuristics. For example, in Designing Effective Speech Interfaces, Weinschenk and Barker describe a set of heuristics tailored to the usability of voice interfaces.

Heuristic evaluations

Heuristic evaluations characterize the usability of software products. This is, by itself, very useful. They become more valuable strategically when:

performed over time to help measure improvements or regressions in usability
applied to other products in the market to critically and more objectively look at the competition and highlight market differentiation

It’s worth mentioning two limitations of heuristic evaluations. First, an evaluation identifies problems with an interface — not solutions. Of course, knowing a problem exists and a bit about its nature is a good start.

Second, don’t consider the results of an evaluation in isolation. Data triangulation is key here: your confidence will grow if data from multiple sources say the same thing. If your conclusions from recent user research don’t agree with those from the latest heuristic evaluation, you need to do more research.

Even with those limitations, heuristic evaluations are a valuable research tool. Heuristics represent generalized knowledge, and they can often be applied effectively “as is” to different types of interfaces. This means you can use the same heuristics to evaluate a website, a mobile app, or a smart TV menu. An evaluation can be performed as a standalone exercise or as part of a more comprehensive UX walkthrough.

Over time, those who participate in heuristic evaluations learn what makes an interface usable, and this knowledge becomes part of their design and development process. While usability experts can always contribute to a richer assessment of an interface, with some practice and a strong evaluation guide, almost anyone can run a useful heuristic evaluation.

Planning an evaluation

How do you decide which heuristics to use as part of an evaluation? Certainly don’t cherry-pick the heuristics that will favour your product’s usability! You may not know where a product’s usability issues lie, so it makes sense to start with a broad set of heuristics such as those proposed by experts like Nielsen and Shneiderman.

The room for customization lies in the target or scope of the evaluation itself. Do you need to look at the entire application or just one workflow or feature? Are you interested in the differences between two competitive workflows or products? When planning an evaluation, discuss and document its scope clearly: state what will and won’t be tested. Not only will this help estimate the time required, it will also set expectations for stakeholders.

Next, decide who will perform the heuristic evaluation. Usability experts (a UX designer, for example) can provide rich detail about their assessment, but anyone involved in product development will likely have a strong sense of usability and can help out. Product managers, developers, and quality assurance folks are all candidates to help with an evaluation.

If you have the resources, having four people independently conduct the same evaluation is ideal (in the early 1990s Jakob Nielsen and Tom Landauer found four or five testers represented the best value). Having multiple evaluators reinforces the idea of data triangulation: Did multiple evaluations produce the same result? If not, why not? And while it’s hard to remove bias entirely, having more than one evaluator will help. If you can, try to find an evaluator who hasn’t contributed to the interface they’ll be testing, either during design or development.
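One way to put the triangulation questions above into practice is to record which evaluators reported each issue and then separate corroborated findings from single-source ones. The Python sketch below is only an illustration: the triangulate helper, the evaluator names, and the issue identifiers are all hypothetical, and it assumes the evaluators have already agreed on a shared label for each issue.

```python
from collections import defaultdict


def triangulate(reports):
    """Map each issue to the sorted list of evaluators who reported it."""
    seen_by = defaultdict(set)
    for evaluator, issues in reports.items():
        for issue in issues:
            seen_by[issue].add(evaluator)
    return {issue: sorted(names) for issue, names in seen_by.items()}


# Hypothetical findings from three independent evaluations of the same scope.
reports = {
    "evaluator_a": {"no-undo-on-delete", "jargon-in-errors"},
    "evaluator_b": {"no-undo-on-delete", "hidden-save-status"},
    "evaluator_c": {"no-undo-on-delete", "jargon-in-errors"},
}

seen_by = triangulate(reports)

# Issues reported by more than one evaluator carry more confidence.
corroborated = sorted(i for i, names in seen_by.items() if len(names) > 1)
# Issues reported by a single evaluator are worth a follow-up conversation.
single_source = sorted(i for i, names in seen_by.items() if len(names) == 1)

print("Corroborated:", corroborated)
print("Single source:", single_source)
```

A single-source issue is not necessarily wrong; it may simply reflect one evaluator's expertise, which is why the "If not, why not?" conversation matters.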

Running an evaluation

Heuristic evaluations are not unlike testing someone for their driver’s license. Imagine sitting in the passenger seat of a car, clipboard in hand, ready to evaluate a new driver. In this case, the goal is to determine the strengths and weaknesses of the driver as they perform some tasks, using observation and a standard rubric.

In the same way, when you run a heuristic evaluation, you’ll:

determine its scope (which aspects are being tested),
define the rubric (the evaluation guide describing which heuristics are being used), and
come up with a standard way of reporting the results (usually a form with space for a score, comments, and diagrams).

Remember, the goal of an evaluation is not to come up with solutions. Instead, the evaluation highlights gaps in a product’s usability to the team so they can respond using whatever business process they want.

Your heuristic evaluation results should include a breakdown of the infractions and their severity. Relate each infraction to a specific heuristic in the evaluation set, and include a description of the issue and a screenshot or video.

Heuristics	Infractions
H1. Visibility of system status	3
H2. Match between system and real world	1
H3. User control and freedom	3
H4. Consistency and standards	0
H5. Error prevention	5
H6. Recognition rather than recall	7
H7. Flexibility and efficiency of use	2
H8. Aesthetic and minimalist design	1
H9. Help users recognize, diagnose, and recover from errors	0
H10. Help and documentation	2
Total:	24

Severity	Infractions
0: Not a usability problem	0
1: Cosmetic problem	3
2: Minor usability problem	6
3: Major usability problem	12
4: Critical usability problem	3
Total:	24
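If each infraction is recorded as a structured entry, summaries like the two tables above can be generated rather than counted by hand. The Python sketch below shows one possible approach; the Infraction record, the summarize helper, and the sample findings are hypothetical and not part of any standard tool. The severity labels follow the 0 to 4 scale used above.

```python
from collections import Counter
from dataclasses import dataclass

# Severity labels on the 0-4 scale used in the severity table.
SEVERITY_LABELS = {
    0: "Not a usability problem",
    1: "Cosmetic problem",
    2: "Minor usability problem",
    3: "Major usability problem",
    4: "Critical usability problem",
}


@dataclass(frozen=True)
class Infraction:
    """One recorded problem, tied to a single heuristic (hypothetical structure)."""
    heuristic: str    # e.g. "H5. Error prevention"
    severity: int     # 0-4, see SEVERITY_LABELS
    description: str  # what the evaluator observed


def summarize(infractions):
    """Count infractions per heuristic and per severity level."""
    by_heuristic = Counter(i.heuristic for i in infractions)
    by_severity = Counter(i.severity for i in infractions)
    return by_heuristic, by_severity


# Made-up sample findings, for illustration only.
findings = [
    Infraction("H1. Visibility of system status", 3, "No progress indicator during upload"),
    Infraction("H5. Error prevention", 4, "Delete has no confirmation step"),
    Infraction("H5. Error prevention", 2, "Date field accepts past dates"),
]

by_heuristic, by_severity = summarize(findings)

for heuristic, count in sorted(by_heuristic.items()):
    print(f"{heuristic}\t{count}")
for level in sorted(SEVERITY_LABELS):
    # Counter returns 0 for severity levels with no infractions.
    print(f"{level}: {SEVERITY_LABELS[level]}\t{by_severity[level]}")
print(f"Total:\t{len(findings)}")
```

Because both summaries are derived from the same list of entries, their totals always match, which is a useful sanity check on a hand-built report.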

Over time, these results can be used to establish baselines (against which future development is judged), contribute to competitive analysis, increase the capacity of the team for design and usability, and generally improve the product.
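As a rough illustration of using results as a baseline, the Python sketch below compares per-heuristic infraction counts from two evaluations of the same scope. The compare_to_baseline helper is hypothetical, and while the baseline counts are borrowed from the sample table above, the "current" counts are invented for the example.

```python
def compare_to_baseline(baseline, current):
    """Return the change in infraction count per heuristic (current minus baseline)."""
    heuristics = sorted(set(baseline) | set(current))
    return {h: current.get(h, 0) - baseline.get(h, 0) for h in heuristics}


# Baseline counts taken from the sample table; current counts are invented.
baseline = {"H1": 3, "H5": 5, "H6": 7}
current = {"H1": 1, "H5": 5, "H6": 9}

deltas = compare_to_baseline(baseline, current)

# A positive delta means more infractions than before (a regression).
regressions = [h for h, d in deltas.items() if d > 0]
# A negative delta means fewer infractions than before (an improvement).
improvements = [h for h, d in deltas.items() if d < 0]

print("Regressions:", regressions)
print("Improvements:", improvements)
```

Raw counts ignore severity, so a comparison like this is best read alongside the severity breakdown rather than on its own.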