Tullis 2002 Commentary
An important excerpt from an upcoming publication on Uzilla.
Since instrumentation is difficult and extraction of knowledge from the resulting data set perhaps more difficult, it is worth considering the role event logs play in successful usability testing protocols. One avenue for isolating such a contribution is through comparisons of lab-based and remote testing protocols. In “remote usability testing”, participants are typically at their own workstations. In some scenarios, a voice connection is made to a test administrator; usability professionals have also reported using screen sharing programs to view a remote participant’s screen.
Tullis et al. (2002) compared lab and remote testing in two experiments, with samples of 8 lab users and 29 remote users in experiment 1, and 8 and 88, respectively, in experiment 2. With only limited instrumentation, they found that both methods revealed the same core problems and produced similar task time and success measures. Each protocol revealed some problems that the other did not: lab testing generally revealed more problems in experiment 1, despite a remote sample roughly 3.6 times larger (29 versus 8), while remote testing produced more unique problems in experiment 2, with a sample 11 times larger (88 versus 8).
The technological solution used by Tullis et al. could not circumvent browser security restrictions, which prevented recording of page paths and intra-page activity. Thus, their comparison may have underestimated the potential of remote testing. The problems uniquely identified in the lab were summarized as: "We saw evidence of certain kinds of user behaviors in the lab (e.g., excessive scrolling, failure to see certain elements on the screen at first) that were less likely to be captured in the remote tests." Excessive scrolling can be recorded in an instrumented browser, and a failure to see a critical element might be identified through mousing behavior.
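As a rough illustration of how an event log could surface excessive scrolling, here is a minimal sketch. It assumes an instrumented browser that records a timestamp and vertical scroll offset for each scroll event; the ScrollEvent record, the function names, and the thresholds are all hypothetical and are not taken from Tullis et al. or from any particular tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScrollEvent:
    """One logged scroll event (hypothetical log format)."""
    t_ms: int  # time since page load, in milliseconds
    y: int     # vertical scroll offset, in pixels


def scroll_summary(events: List[ScrollEvent]) -> Dict[str, int]:
    """Return total distance scrolled and the number of direction reversals."""
    distance = 0
    reversals = 0
    last_direction = 0  # 0 until the first actual movement is seen
    for prev, cur in zip(events, events[1:]):
        delta = cur.y - prev.y
        if delta == 0:
            continue  # no movement between these two events
        distance += abs(delta)
        direction = 1 if delta > 0 else -1
        if last_direction != 0 and direction != last_direction:
            reversals += 1
        last_direction = direction
    return {"distance_px": distance, "reversals": reversals}


def is_excessive(summary: Dict[str, int], page_height_px: int,
                 max_passes: float = 2.0, max_reversals: int = 3) -> bool:
    """Flag a page view as 'excessive scrolling'.

    The thresholds are illustrative assumptions: more than `max_passes`
    page heights of travel, or more than `max_reversals` changes of
    direction, counts as excessive.
    """
    too_far = summary["distance_px"] > max_passes * page_height_px
    too_jumpy = summary["reversals"] > max_reversals
    return too_far or too_jumpy


# Example: a participant scrolls down, back up, down again, and back up.
log = [ScrollEvent(0, 0), ScrollEvent(500, 400), ScrollEvent(900, 800),
       ScrollEvent(1500, 300), ScrollEvent(2100, 900), ScrollEvent(2800, 200)]
summary = scroll_summary(log)
flagged = is_excessive(summary, page_height_px=1000)
```

Any real threshold would need to be calibrated against observed sessions; the point is only that the behavior seen by lab observers leaves a measurable trace in a scroll log.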
The Tullis et al. study strongly suggests that remote testing can be a fruitful way to do usability testing and that the potential for increased sample sizes can produce more robust results. Their results also suggest that comment data is critical. Ebling & John (2000) analyzed the source of usability problem identifications from quantitative versus protocol data in a single usability test. Protocol data was found to contribute uniquely, and to the greatest extent, to problem identification, although quantitative measures replicated the Tullis et al. finding of a high detection rate for major issues.
Based upon these studies, it seems advisable to solicit user opinion on the ease of accomplishing a task in a usability test. Thus, future work should compare real-time verbal protocol with post-task surveying of ease of use. Remote testing protocols seem to have value, but real-time observation may prove to be superior in many cases. Remote testing may also best be done with a related protocol, termed desktop usability testing, in which a test administrator brings the lab to the user by way of instrumented software. This preserves many of the benefits of remote testing that lead to a greater number of participants.
Given that instrumentation could be a valuable contribution to usability testing processes, this discussion will turn to the mechanics of instrumentation for web usability testing. Instrumentation for web browsers, while buttressed by the common logging facilities of web servers, has remained an elusive goal, with technical flaws limiting the impact of conducted research.
Tullis, T., Fleischman, S., McNulty, M., Cianchette, C., and Bergel, M. (2002). An Empirical Comparison of Lab and Remote Usability Testing of Web Sites. Usability Professionals Conference, Pennsylvania, 2002.
Ebling, Maria R. & John, Bonnie E. (2000). On the Contributions of Different Empirical Data in Usability Testing. Conference proceedings on Designing interactive systems. NY, NY. DOI: http://doi.acm.org/10.1145/347642.347766

