Here are a few tips on conducting effective user evaluations of your software.
There are two major approaches: Observation and Questionnaire.
Observation is best done with a relatively small group of users (fewer than 20). It is most often used early in the evaluation process to get a general idea of what works and what does not. In an observation session, an observer watches a tester go through the program and takes notes on how the tester does, what problems the tester encounters, and so on. The observer should give the tester only minimal instructions and intervene only when the tester gets stuck and can't continue.
One good technique to use in observation is to ask the user to vocalize or think aloud as he is going through the program. That way the observer can get an idea of what is going through the user's head—puzzlement, thinking certain features are easy or difficult to use, etc.
Questionnaires are best used with a larger group and are usually a tool to get more detailed information about how the software holds up in heavier use. In general, the more testers you have, the more reliable your feedback.
Try to keep your questionnaire to a reasonable length, usually no more than two pages. This way you respect the tester's time and avoid generating more information than you can digest. Refine your questionnaire so that it covers only the most important information you want to gather.
An impressionistic analysis is probably more useful in this kind of evaluation than a formal statistical analysis. Remember that you are largely evaluating user experience—a mostly subjective judgment. The other thing you are looking for is bugs that you didn't catch during your own testing. Statistics are of little relevance when it comes to bugs; even if only one user encounters a particular bug, you should fix it.
If your questionnaire contains numerical responses, it is usually sufficient to calculate only totals and averages for these responses. This gives you a general impression of how positive or negative the users' experience of a particular feature was. For example, if a question asks the user to evaluate a particular feature on a scale of 1 to 5, finding the average response over all users can give you a good idea of how successful that feature was. A score of 4.8 would indicate that it was well received; a score of 2.5 would tell you that the feature had problems and might need to be revised.
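The totals-and-averages calculation described above can be done in a spreadsheet, but a short script works as well. The following is only a minimal sketch in Python: the feature names, the sample ratings, and the `summarize` helper are all made up for illustration, and it assumes each tester's answers are stored as a dictionary of 1-to-5 ratings, with `None` for a skipped question.

```python
from statistics import mean

# Hypothetical data: one dictionary per tester, mapping a feature to a 1-5 rating.
# None marks a question the tester left blank.
responses = [
    {"navigation": 5, "search": 2},
    {"navigation": 5, "search": 3},
    {"navigation": 4, "search": 2},
    {"navigation": 5, "search": 3},
]


def summarize(responses):
    """Return {question: (count, total, average)}, ignoring skipped questions."""
    ratings_by_question = {}
    for answer_sheet in responses:
        for question, rating in answer_sheet.items():
            if rating is not None:
                ratings_by_question.setdefault(question, []).append(rating)
    return {
        question: (len(ratings), sum(ratings), mean(ratings))
        for question, ratings in ratings_by_question.items()
    }


summary = summarize(responses)
for question, (count, total, average) in summary.items():
    print(f"{question}: n={count}, total={total}, average={average:.2f}")
```

With this sample data, the "navigation" feature averages 4.75 and "search" averages 2.50, so "search" would be the one to look at more closely. Reporting the count alongside the average is a deliberate choice: an average based on only a handful of answers deserves less weight.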
In any case, even with numeric responses it is useful to leave space for the user to give a written comment. That helps you to focus on what it was about a given feature that people liked or didn't like.
See some evaluation questionnaires and observation guidelines that the instructor has used over the years, as well as some good examples of instruments created by students of this course in the past.
How to Test Your Stacks: An old but still very good guide to conducting informal usability testing, from Jacqueline Landman Gay, a long-time HyperCard and LiveCode developer.
"Should Testers Be Allowed to Do Their Job?": Insights on the testing process and the resources software developers devote to testing.
See the section on Mobile Usability Test Tips and Tricks in our text, Brian Fling, Mobile Design and Development, pp. 297-298.
Final Project Evaluation Assignment