For improved ROI, conduct heuristic evaluations prior to usability testing

Catch low-hanging fruit with heuristics so that users can reveal deeper insights in usability tests

Matthew Garvin
UX Collective

--

User experience research tends to break down into two broad categories: field studies and usability testing. Or, we might refer to these as needs assessment and usability evaluation. Either way, heuristic evaluations fall under the umbrella of usability methods. The method was introduced by Nielsen and Molich (1990) and popularized as a means of discount usability evaluation, aimed at software startups that didn’t have the budget for real user research. Today, user research is more common, and usability testing is the gold standard. If you want to maximize your return on investment (ROI) for usability testing, you’ll want to perform a heuristic evaluation first. This article explains what a heuristic evaluation is, how to do one, and the pros and cons of the method, and makes the case for conducting one ahead of usability testing to maximize the return on investment for both.

In Nielsen’s own words:

“Heuristic evaluation is a usability engineering method for finding the usability problems in a user interface design so that they can be attended to as part of an iterative design process. Heuristic evaluation involves having a small set of evaluators examine the interface and judge its compliance with recognized usability principles (the “heuristics”).” ~ Jakob Nielsen

Defining ‘heuristic’

With that, let us simply define a heuristic as a usability principle or “rule of thumb”. Although heuristics in a UX context (as opposed to AI) refer to usability principles, a designer could theoretically employ the same process to judge a product’s compliance with a design system.

As an example, let’s say you have an app that was designed without a design system in place. Now your company has adopted a system based on Material Design. You go to the Material website and compile a list of its guidelines with which to judge your UI’s compliance. Those guidelines can serve as your “heuristics”, at least in terms of the design; the heuristics discussed in the rest of this article, however, are usability heuristics.
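To make this concrete, here is a minimal sketch of what such a compliance checklist might look like if you kept it in code rather than in a spreadsheet. The guideline entries are illustrative stand-ins, not an official Material Design list.

```python
# A hypothetical compliance checklist: each entry pairs a guideline
# (serving as a "heuristic") with notes on whether the UI complies.
checklist = [
    {
        "guideline": "Touch targets are at least 48x48dp",
        "source": "Material Design accessibility guidance",
        "complies": False,
        "notes": "Toolbar icon buttons measure 32x32dp.",
    },
    {
        "guideline": "Primary actions use the theme's primary color",
        "source": "Material Design color guidance",
        "complies": True,
        "notes": "",
    },
]

# List every guideline the UI fails, e.g. to walk through in a review.
for item in checklist:
    if not item["complies"]:
        print(f"VIOLATION: {item['guideline']} ({item['notes']})")
```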

Nielsen developed his heuristics in the early ’90s, distilling a list of nearly 300 known usability issues down to 10 overarching principles. And although they are still widely used today, many user researchers are beginning to develop their own heuristics, focused on modern technology and the issues that come with it. We didn’t have the powerful mobile and smart technology back then that we take for granted today, and the computing technology we did have wasn’t widespread or general enough for software companies to pay much attention to accessibility issues.

Nowadays, we have a variety of heuristic sets to choose from. For information on some of the more popular sets, refer to Norbi Gaal’s article, “Heuristic Analysis in the design process”.

In addition to the sets referenced by Norbi, there are other specialized sets worth noting, such as the heuristics for mobile computing proposed by Bertini et al. (2009).

Developing heuristics

While developing your own heuristics may be encouraged, care must be taken when selecting appropriate principles. This is where prior user research can inform which heuristics are selected: what are the users’ needs, preferences, and pain points that you are trying to support and solve for? Furthermore, and perhaps most importantly, you will want to pilot your heuristics in the same fashion as you would pilot your interviews, surveys, and usability tests.

Quiñones et al. (2018) describe a methodology for developing heuristics. It is an eight-step process in which researchers:

  1. Explore: Perform a literature review.
  2. Experiment: Analyze data from different experiments to collect additional information.
  3. Describe: Select and prioritize the most important topics revealed in steps 1–2.
  4. Correlate: Match the features of the specific domain with the usability/UX attributes and existing heuristics.
  5. Select: Keep, adapt, create, and eliminate heuristics obtained in steps 1–4.
  6. Specify: Formally specify the new set of heuristics.
  7. Validate: Validate the heuristics through experimentation in terms of effectiveness and efficiency in evaluating the specific application.
  8. Refine: Refine and improve the new heuristics based on feedback from step 7.

As you can imagine, this process isn’t a quick-and-dirty means of getting feedback; rather, it’s an entire project in itself.

The evaluation process

A heuristic evaluation is what is referred to as an expert review. As with other expert reviews, a heuristic evaluation is intended to be a quick-and-dirty method that uncovers issues more cheaply than usability testing, in terms of both time and money. If you’re not going through the process of developing a new set of heuristics as outlined above, the entire heuristic evaluation should only take about a week, with the actual evaluation taking no more than a day or two. Instead of recruiting users to put your design in front of, you recruit 3–5 evaluators to review your design against the chosen heuristics.

The heuristic evaluation process:
  • Familiarize — If you have multiple evaluators (as you should!), then you will want them to devote some time to familiarizing themselves with the heuristics you plan to use for the evaluation. This is particularly crucial if you are also expecting them to validate a new set of heuristics.
  • Evaluate — There are a few parts to this stage.
1. First, and let’s be clear: your evaluators should not have intimate knowledge of your product. You should not be recruiting people who make design/implementation decisions on this product.
2. The evaluators have familiarized themselves with the heuristics; now let them familiarize themselves with the product. They should spend an hour or two navigating, clicking/tapping buttons, and understanding the basic patterns and flows the user experiences.
3. Heuristic evaluations are typically conducted in two passes, each anywhere from 1–3 hours. In the first pass, evaluators holistically interact with the product and note any heuristic violations. In the second pass, evaluators do it all over again. They also retrace their steps and consider whether any violations from the first pass are false alarms.
  • Rate Severity — This step doesn’t have to be done on its own. Often evaluators will rate severity at the same time they note a violation. They may go back on the second pass and change the severity ratings of previously noted violations. A standard rating scale comes from Jakob Nielsen and looks like this:
0: I don’t agree that this is a usability problem at all
1: Cosmetic problem — quick fix, or ignore unless there’s time
2: Minor usability problem — low priority
3: Major usability problem — high priority
4: Usability catastrophe — must be fixed before release
  • Synthesize and Prioritize Findings — At this stage, the evaluation is complete, and the analysis can begin. The evaluators come together and discuss their findings. They will create an aggregate list of all noted violations, discuss and identify potential false alarms, and agree upon severity scoring. If they are validating new heuristics, this is also the point at which they will do so. (A sketch of what such an aggregate list might look like follows below.)
  • Converge on Design Recommendations — Based on a review of the prioritized findings, the evaluators will then brainstorm and converge on recommendations to solve the usability issues uncovered in the heuristic evaluation.
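To make the synthesis step concrete, here is a minimal sketch of how an aggregate violation log might be structured, using Nielsen’s 0–4 severity scale from above. The data model, the sample findings, and the average-severity ranking are illustrative assumptions; in practice a shared spreadsheet serves the same purpose.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Violation:
    evaluator: str    # who logged the issue
    heuristic: str    # which principle was violated
    location: str     # screen or flow where it was observed
    description: str  # what went wrong
    severity: int     # 0-4, per Nielsen's scale above

# Hypothetical findings from three evaluators; A and B found the same issue.
findings = [
    Violation("A", "Visibility of system status", "checkout",
              "No progress indicator during payment", 3),
    Violation("B", "Visibility of system status", "checkout",
              "No progress indicator during payment", 4),
    Violation("C", "Error prevention", "signup",
              "No confirmation before discarding a draft", 2),
]

# Synthesis: group duplicate findings and average the severity ratings,
# then print a single prioritized list for the team to act on.
grouped = defaultdict(list)
for v in findings:
    grouped[(v.heuristic, v.location, v.description)].append(v.severity)

ranked = sorted(grouped.items(), key=lambda kv: -sum(kv[1]) / len(kv[1]))
for (heuristic, location, description), ratings in ranked:
    avg = sum(ratings) / len(ratings)
    print(f"[avg {avg:.1f}, n={len(ratings)}] {heuristic} @ {location}: {description}")
```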

Why 3–5 evaluators

Depending on your circumstances and the experience of the evaluators at your disposal, it may be possible to produce significant findings with a single evaluator. However, there are a few reasons for having multiple evaluators. Nielsen found through his own research on the method that a single evaluator will only uncover about 35% of the issues present in the system (Nielsen, 1994). Furthermore, different evaluators tend to find different problems. From the curve shown below, Nielsen demonstrates that the optimal number of evaluators is 3–5. While you may uncover some additional issues by adding more than five evaluators, depending on how critical and complex the system under evaluation is, each additional evaluator becomes more likely to find issues that overlap with those already noted by others. In other words, there are diminishing returns in a cost-benefit analysis, as shown below.

Source: Nielsen (1994) Curve showing the proportion of usability problems in an interface found by heuristic evaluation using various numbers of evaluators. The curve represents the average of six case studies of heuristic evaluation.
Source: Nielsen (1994) Curve showing how many times the benefits are greater than the costs for heuristic evaluation of a sample project using the assumptions discussed in the text. The optimal number of evaluators in this example is four, with benefits that are 62 times greater than the costs.
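The shape of these curves follows from a simple cumulative-discovery model: if a single evaluator finds a proportion λ of the problems (roughly 35%, per Nielsen’s finding above) and evaluators find problems more or less independently, then n evaluators together find about 1 − (1 − λ)^n of them. A quick sketch, assuming λ = 0.35:

```python
# Proportion of usability problems found by n evaluators under the
# cumulative-discovery model: found(n) = 1 - (1 - lam) ** n.
# lam = 0.35 matches the ~35% single-evaluator figure cited above.
def proportion_found(n: int, lam: float = 0.35) -> float:
    return 1 - (1 - lam) ** n

for n in range(1, 9):
    print(f"{n} evaluator(s): {proportion_found(n):.0%}")
# 1 -> 35%, 3 -> 73%, 5 -> 88%, 8 -> 97%: beyond about five evaluators,
# each addition contributes only a few percentage points.
```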

Pros and cons

As with any method, there are of course advantages and disadvantages. This list is derived from the Interaction Design Foundation (IDF) article “What is Heuristic Evaluation?”.

Pros:

  • Evaluators can focus on specific issues.
  • Evaluators can pinpoint issues early on and determine the impact on overall UX.
  • You can get feedback without the ethical and practical dimensions and subsequent costs associated with usability testing.
  • You can combine it with usability testing.
  • With the appropriate heuristics, evaluators can flag specific issues and help determine optimal solutions.

Cons:

  • Depending on the evaluator, false alarms (noted issues that aren’t really problems) can diminish the value of the evaluation (Use multiple evaluators!).
  • Standard heuristics may not be appropriate for your system/product — validating new heuristics can be expensive.
  • It can be difficult/expensive to find evaluators who are experts in usability and your system’s domain.
  • The need for multiple evaluators may make it easier and cheaper to stick with usability testing.
  • It’s ultimately a subjective exercise: findings can be biased toward the evaluator’s perspective and lack proof, and recommendations may not be actionable.

Note the pro: “You can combine it with usability testing”. When you’re conducting a usability test, your prototype is your hypothesis. If you implement a heuristic evaluation correctly, you can catch and fix the low-hanging fruit among your usability issues, thereby refining your hypothesis before you take it to users. Fixing these issues before testing lets your participants surface problems from the first-person perspective of the persona, rather than spending sessions on the kinds of issues you should have caught yourself.

But let’s not forget to take note of the cons. False alarms can be problematic and diminish the overall value of the evaluation. This is yet another reason why multiple evaluators are crucial to making your heuristic evaluation worthwhile: false alarms can often be identified and disregarded when evaluators come together to synthesize and prioritize findings.

Conclusion

Heuristic evaluations are a mainstay of usability engineering and user experience research. Though considered a “discount” method, they involve plenty of upfront considerations if you want to make the most of them. Using heuristic evaluations as a precursor to usability testing can help improve the return on investment for both, since every issue uncovered and solved with heuristics frees your users to note other issues from their perspective. In sum, you are not your user, and neither are your evaluators. Using heuristic evaluations in conjunction with usability testing will iron out a lot of the kinks before you show the design to users. With those issues already solved, feedback from usability testing can generate the deeper insights needed to really dial in the design, improving the ROI of both the heuristic evaluation and the usability test.

Sources

Bertini, E., Catarci, T., Dix, A., Gabrielli, S., Kimani, S., & Santucci, G. (2009). Appropriating Heuristic Evaluation for Mobile Computing. International Journal of Mobile Human Computer Interaction, 20–41.

Gaal, N. (2017, June 19). Heuristic Analysis in the design process. Retrieved from UX Collective: https://uxdesign.cc/heuristic-analysis-in-the-design-process-usability-inspection-methods-d200768eb38d

Nielsen, J. (1994, January 1). Guerrilla HCI: Using Discount Usability Engineering to Penetrate the Intimidation Barrier. Retrieved from NN/g Nielsen Norman Group: https://www.nngroup.com/articles/guerrilla-hci/

Nielsen, J., and Molich, R. (1990). Heuristic evaluation of user interfaces, Proc. ACM CHI’90 Conf. (Seattle, WA, 1–5 April), 249–256.

Nielsen, J. (1994, November 1). How to Conduct a Heuristic Evaluation. Retrieved from NN/g Nielsen Norman Group: https://www.nngroup.com/articles/how-to-conduct-a-heuristic-evaluation/

Quiñones, D., Rusu, C., & Rusu, V. (2018). A methodology to develop usability/user experience heuristics. Computer Standards & Interfaces, 109–129.

Soegaard, M. (2020, July 19). What is Heuristic Evaluation? Retrieved from Interaction Design Foundation: https://www.interaction-design.org/literature/topics/heuristic-evaluation

