The Panlingual Camera Phone: Interactive Prototype, Version 1

Martin Hecko, Kinsley Ogunmola, Jonathan Pool, Tim Wong, Peter Woodman

Project sites:
http://www.cs.washington.edu/education/courses/cse490f/07wi/project_files/camera/
http://panlingual.org/ (hosted by Utilika Foundation)

Report URLs:
http://www.cs.washington.edu/education/courses/cse490f/07wi/project_files/camera/rereproto/
http://panlingual.org/rereproto/

Tasks

We are developing a user interface for a panlingual camera phone. You can test the interactive prototype, version 1, on these tasks:

"Taking a Bus" (easy): While sightseeing, you are at a bus stop. A bus approaches, with a foreign-language destination sign. You want an English translation, to decide whether to take the bus.

"Ordering a Meal" (moderate): You are traveling in a foreign country and are at a restaurant where the menu is written in, and the waiters speak, a different foreign language. You have a note in English asking for the vegetarian dishes to be pointed out. You want your note translated into the restaurant's language, to show it to the waiter.

"Exploring Your Ancestry" (difficult): You are visiting a cemetery where an ancestor is believed to be buried and have found the likely tombstone, with a foreign-language inscription. You want an English translation and want to explore the inscription and its translation, to see parts of the inscription more clearly, estimate the translation's reliability, and learn some words in the language.

Overview of UI Design Changes

Version 1 continues development with incremental improvements to address the most severe problems discovered in the heuristic evaluation of version 0. The result is an interface with more visible help, more accessible information, and more user control.

The system for which we are designing an interface is an application that runs on a camera-equipped mobile telephone (to be referred to as the "device" below). The application's basic function is to act as a panlingual camera, translating texts in photographs that the user takes from their original ("source") languages into any other ("target") languages. We shall refer to this application as "PanCam" below. The current design assumes that the device has a touch-sensitive display. Our prototype is laid out in accord with one such device currently on the market.

In version 1 of the prototype, the interface has eight states. Controls permit the user to navigate among them and thereby accomplish desired tasks. The states are:

  1. Start: Ready for user to take a photo.
  2. Result: Displays the photo and a translation of the text in it.
  3. Annotated Result: Same as Result, plus annotations showing how one word was translated.
  4. Zoom Start: Ready for user to indicate the start of what PanCam should zoom in on.
  5. Zoom End: Ready for user to indicate the end of what PanCam should zoom in on.
  6. Zoomed In: Same as Result, limited to a user-specified word sequence, magnified.
  7. Source Correction: Ready for user to correct PanCam's guess of the source language.
  8. Target Choice: Ready for user to change the target language or request the OCR output instead.

These states are illustrated in Figure 1 and in the corresponding larger-scale figures in the Appendix. Each small-scale state illustration links to its large-scale counterpart.
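
As an illustration only, the eight states listed above could be modeled as a small state machine. The sketch below is not taken from the prototype: the state names come from the list, but the transition table, the TRANSITIONS name, and the is_valid_path helper are assumptions inferred from the state descriptions. The storyboards, not this table, define the paths the prototype actually permits.

```python
from enum import Enum


class State(Enum):
    """The eight interface states of prototype version 1."""
    START = "Start"
    RESULT = "Result"
    ANNOTATED_RESULT = "Annotated Result"
    ZOOM_START = "Zoom Start"
    ZOOM_END = "Zoom End"
    ZOOMED_IN = "Zoomed In"
    SOURCE_CORRECTION = "Source Correction"
    TARGET_CHOICE = "Target Choice"


# Hypothetical transition table (an assumption, not the prototype's
# documented navigation): each state maps to the states reachable from it.
TRANSITIONS = {
    State.START: {State.RESULT},
    State.RESULT: {
        State.START,
        State.ANNOTATED_RESULT,
        State.ZOOM_START,
        State.SOURCE_CORRECTION,
        State.TARGET_CHOICE,
    },
    State.ANNOTATED_RESULT: {State.RESULT},
    State.ZOOM_START: {State.ZOOM_END},
    State.ZOOM_END: {State.ZOOMED_IN},
    State.ZOOMED_IN: {State.RESULT},
    State.SOURCE_CORRECTION: {State.RESULT},
    State.TARGET_CHOICE: {State.RESULT},
}


def is_valid_path(path):
    """Return True if every consecutive pair in path is an allowed transition."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))
```

Under these assumed transitions, a zoom path such as Start, Result, Zoom Start, Zoom End, Zoomed In would be accepted, while a direct jump from Start to Zoomed In would be rejected.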

[Panels: Start, Result, Annotated Result, Zoom Start, Zoom End, Zoomed In, Source Correction, Target Choice]

Figure 1. States of Interface in Prototype Version 1.

The storyboards in Figure 2, linked to large-scale versions in the Appendix, show the paths that version 1 of the prototype permits the user to follow from state to state, illustrated for the three demonstration tasks.

[Storyboards: Taking a Bus, Ordering a Meal, Exploring Your Ancestry]

Figure 2. Interface Navigation Paths in Prototype Version 1.

Some other states are not implemented in version 1 of the prototype. These include states in which context-sensitive help is displayed; PanCam instructions and training are given; the user retrieves, annotates, organizes, and submits for translation previous photos and texts; and the user supplies texts for translation by typing or speaking them rather than photographing them.

Major Usability Problems Addressed

Version 1 of the prototype addresses seven problems noted in the heuristic analysis. Each problem is labeled below with the number it has in the heuristic-analysis report.

Most severe (level-4) problems:

Next-most severe (level-3) problems:

Appendix

Start State
Figure A1a. Start State.

Result State
Figure A1b. Result State.

Annotated Result State
Figure A1c. Annotated Result State.

Zoom Start State
Figure A1d. Zoom Start State.

Zoom End State
Figure A1e. Zoom End State.

Zoomed In State
Figure A1f. Zoomed In State.

Source Correction State
Figure A1g. Source Correction State.

Target Choice State
Figure A1h. Target Choice State.

Taking a Bus
Figure A2a. Taking a Bus.

Ordering a Meal
Figure A2b. Ordering a Meal.

Exploring Your Ancestry
Figure A2c. Exploring Your Ancestry.
