February 2009

RTFM, reprise, originally uploaded by steelmonkey.

An in-flight entertainment system that includes instructions for its own use. Interestingly, the menu button, which brings up all the interesting things to do, is the only one left unmarked. Was this arrived at by trying to hide complexity, foregrounding the viewing controls over everything else? It only ends up making half the functionality invisible. Lesson: reduction of complexity is not the same as simple design.

Gesturing has gotten a bad name: it has come to mean any action performed with a pointer that is not a direct manipulation, and is consequently measured in terms of the mathematical and statistical properties of a time series of pointer locations. This gives us things like touchscreen flicking (good), Graffiti & similar gesture-based writing (interesting but not terribly intuitive) and Firefox mouse gesture extensions (send out the SOSs).

But human gesturing - deixis - is both a lot more complex and a lot simpler. Deixis serves two purposes: a) it identifies & addresses[1], and b) it illustrates.

Consider identification & addressing (we will deal with illustration later). When we say something like "give me that" and point at an object, we use a speech marker (the deictic "that") to indicate that something is being addressed, and we use the shape of the arm & hand and its orientation to identify it. But pointing is not necessary: the "that" can also refer to something which has been mentioned previously in the conversation, or in the same sentence (the anaphoric "that"). In both cases, the use of "that" depends on the act of pointing or on a supporting linguistic context; in the case of pointing, the deictic "that" and the extended arm are co-dependent - neither makes much sense without the other.

Consider the WIMP & direct manipulation mode of interaction: it relies on a basic vocabulary of pointing, clicking, and dragging with a mouse/pen/tablet or finger; the keyboard is primarily used to navigate within the set of clickable/draggable/pointable interactive objects. Combine these actions wilfully and you get such wonders as nine-button mice, triple and quadruple clicks, and, of course, mouse gestures. Interaction with this vocabulary does not have a grammar: any action can be followed by any other action (this is pointing). Consider the command-line mode of interaction: it is linguistic (well, it has a grammar), relies heavily on the keyboard, and has methods of addressing, usually through variable-passing mechanisms and naming patterns (in other words, it has the second, anaphoric kind of "that").

Two different systems of interaction, designed (unintentionally, one cynically suspects) to be separate and mutually exclusive. And yet, each an essential component of completely ordinary, effortless and extremely effective acts of deixis that people use with each other every day. [2]

That's all you need: pointing and addressing, performed simultaneously or in some standard sequence. For instance, here's what a truly deictic method of renaming a bunch of files could look like [3]:

1. you start typing: "take these "
2. "these" becomes bold, indicating that a deictic signifier has been encountered
3. you continue typing: "files " - the text becomes "these files"
4. you switch to a file browser window and select an arbitrary bunch of files
5. you continue typing: "and rename, replacing 'today' with 'tomorrow' if modified today" and press enter

Compare this with how you would have done it using only the WIMP method, or only the command line. (I leave mocking up or prototyping this to a more enthusiastic soul than I.)
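For the curious, here is a very rough sketch, in Python, of the binding step in the renaming example: the typed sentence supplies the deictic word, a selection made elsewhere supplies what it points at, and neither does anything without the other. Everything in it is an assumption made for illustration: the names (FileEntry, find_deictic, plan_renames), the one-sentence "grammar", and the decision to return a list of planned renames rather than touch any files. It says nothing about the focus problem in note [3].

```python
import re
from dataclasses import dataclass
from datetime import date

# Words treated as deictic signifiers (an illustrative, incomplete set).
DEICTIC_WORDS = {"this", "that", "these", "those"}

# Hypothetical grammar covering only the single sentence from the example.
RENAME = re.compile(
    r"take these files and rename, "
    r"replacing '(?P<old>[^']+)' with '(?P<new>[^']+)'"
    r"(?P<cond> if modified today)?$"
)


@dataclass
class FileEntry:
    """A selected file: its name and the date it was last modified."""
    name: str
    modified: date


def find_deictic(text):
    """Return the first deictic word in the typed text, or None."""
    for token in text.lower().split():
        if token in DEICTIC_WORDS:
            return token
    return None


def plan_renames(command, selection, today):
    """Bind 'these' to the pointed-at selection; return (old, new) name pairs."""
    if find_deictic(command) is None:
        raise ValueError("no deictic word to bind the selection to")
    if not selection:
        raise ValueError("a deictic word was typed but nothing was pointed at")
    match = RENAME.match(command.strip())
    if match is None:
        raise ValueError("command not understood")
    plan = []
    for entry in selection:
        # Apply the optional "if modified today" condition.
        if match["cond"] and entry.modified != today:
            continue
        if match["old"] in entry.name:
            new_name = entry.name.replace(match["old"], match["new"])
            plan.append((entry.name, new_name))
    return plan


# The selection stands in for step 4: files picked in a file browser.
selection = [
    FileEntry("notes-today.txt", date(2009, 2, 1)),
    FileEntry("todo-today.txt", date(2009, 1, 31)),
    FileEntry("readme.txt", date(2009, 2, 1)),
]
command = ("take these files and rename, "
           "replacing 'today' with 'tomorrow' if modified today")
print(plan_renames(command, selection, today=date(2009, 2, 1)))
```

The point of the two ValueError checks is the co-dependence described earlier: "these" without a selection is rejected, and so is a selection offered to a sentence with no deictic word in it.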

Interaction Design, meet Deixis. Deixis, Interaction Design.

Notes:
[1] 'addressing' here has two senses: in human speech, identifying a subject or directing a remark; in computer science, specifying the location of some information
[2] there is no point discussing how tangible interaction methods will change all this unless computing starts to happen in the environment instead of on a two-dimensional display
[3] at least one problem will get in the way of making this possible right now: the notion of focus (as it pertains to interface controls); no window manager has worked out a good (i.e. consistent & predictable) method of switching focus while performing a pointing (clicking, dragging, etc.) action

References:
1. The Mozilla Ubiquity Project
2. Enso, Quicksilver, Gnome-Do and others
3. Di Blas, N., & Paolini, P. (2003). There and back again: What happens to phoric elements in a 'web dialogue'. Document Design, 4, 194-206. doi: 10.1075/dd.4.3.01di