topic: interface design

[The following is a response to a thoughtful and provocative essay by the wonderful folks at Johnny Holland, where Stephen Anderson attempts to outline an informatics-based behaviour influence/modification system for email, and I disagree, arguing that it's technology, not behaviour, that needs to be fixed. You should read the article, followed by my comment and Stephen's response, in order for the following to make sense.]

Stephen writes:

If I was designing a new email platform from the ground up with these things built in, the execution would probably be different– I’d build some of these ideas into the architecture of the system– take a “break away from the conventional ideas that have got us in this mess, ” to quote the article. But, that would introduce a bigger problem: asking people to change their email platform (which is a much, much more daunting challenge than the gaming I describe in the article!)

Yes, it's a tricky business [1]. Create a temporary fix on a broken system and risk cementing it even further, or create an entirely new system and cause upheaval? I don't have any good answers, but I suspect a redesign of email needn't cause much upheaval at all, and in fact might make things even more invisible.

Stephen further comments on context:

Your comment about context seems to me the more challenging one– and a critical consideration.

Context is a tricky business. Paul Dourish has written extensively on context; if you can survive his invented terminology ('technomethodology'?!!) - and 'Where the Action Is' is an excellent read - he has interesting things to say about it, the most consequential of which is that context is created through interaction. What this means for our email quandary is that email software could become much better at knowing what to support by paying attention to its interactions with the user.

To elaborate: most (desktop) software at present is usually state-ful but path-agnostic. If you think of software as something that's a set of states (displaying email, writing email, downloading attachments etc), then interaction is a path through software states. [most web software is actually stateless on the server side, which is why cookies are used to track state on the client side].

Most software doesn't actually track the path through which a particular state was reached. For instance, when composing a new email, it doesn't matter if you are replying to an existing email or starting a new thread, or whether you were just viewing a project folder or your inbox. The email compose window functions the same way, and what it does and presents to you is largely identical in all cases (if you use the Postbox email client, the sidebar always shows all attachments, regardless of who is involved in the email).

But if email software were to act differently depending on the path taken to the current state, then each state would have a lot more information to act on, and this creates opportunities to better understand what the user is doing and adapt accordingly. So, if you were viewing a project folder and composed a new email, that email could automatically be put in the project folder. Or if an email is referred to repeatedly during the course of a day, the software could suggest showing it in a more persistent view. Or when you respond repeatedly to emails from a particular person, it could prioritise new mail notifications from that person. (These are just crude examples: actual behaviour would probably have to be a lot more sophisticated.)
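To make the state-ful versus path-sensitive distinction concrete, here is a minimal Python sketch of the three crude examples above. Everything in it is an assumption made for illustration: the `PathTracker` class, the state names (`view_folder`, `compose`, `view_message`, `reply`) and the thresholds are hypothetical and do not come from any real email client.

```python
from collections import Counter, deque


class PathTracker:
    """Remembers the recent interaction path, not just the current state.

    Hypothetical helper: state names and detail keys are illustrative only.
    """

    def __init__(self, max_steps=50):
        # A bounded history: the oldest steps fall off automatically.
        self.path = deque(maxlen=max_steps)

    def record(self, state, **details):
        self.path.append((state, details))


def suggest_folder(tracker):
    """If a compose step directly follows a folder view, suggest that folder."""
    if len(tracker.path) < 2:
        return None
    (prev_state, prev_details), (state, _) = tracker.path[-2], tracker.path[-1]
    if state == "compose" and prev_state == "view_folder":
        return prev_details.get("folder")
    return None


def messages_to_pin(tracker, threshold=3):
    """Messages viewed at least `threshold` times in the remembered path."""
    views = Counter(d["message_id"] for s, d in tracker.path if s == "view_message")
    return [message for message, n in views.items() if n >= threshold]


def priority_senders(tracker, threshold=2):
    """People replied to at least `threshold` times in the remembered path."""
    replies = Counter(d["to"] for s, d in tracker.path if s == "reply")
    return {person for person, n in replies.items() if n >= threshold}


tracker = PathTracker()
tracker.record("view_folder", folder="Project X")
tracker.record("compose")
print(suggest_folder(tracker))  # Project X
```

A path-agnostic client would only ever see the last entry (`compose`); the suggestion is possible only because the step before it is still available.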

These are ways to be path-sensitive (and hence interaction-sensitive, and context-sensitive) within an application. But the notion of context extends a little further than that - full context-sensitivity must, I think, consider:

  • interaction paths (whether within or across applications)
  • contemporaneity (what else is being interacted with/running/happening)
  • tendencies (what happens more or less often - this is where most personal informatics focuses, and it's the idea behind Gmail Labs' ‘Bob’ features - ‘Don't forget Bob’ and ‘Got the wrong Bob?’ - and Firefox's awesome bar)
  • interaction patterns (sequences that are semantically meaningful, even if not constantly repeated)
  • organizational structures (or information relationships)

[This is not even taking into account the place (say, a meeting room), the people involved (and their relationship to you), and the actual content of the email (or whatever else) itself. Which is a discussion for another day.]

This is where we can return to the central quandary posed by the personal informatics behaviour modification system Stephen proposes: bandage a broken system or force re-learning? I think that this may be a non-issue: if email clients are ‘smarter’ in this sense, we might be able to use the same interface to deliver much more complex behaviour. So, when Stephen writes

For one person, 10 emails a day is the norm. For someone else, juggling several 100 emails a day may present no problem

he's actually thinking of how to make an interface that has to work for different people with different behaviours. But this notion of context-sensitivity suggests interfaces that behave differently depending on, say, the number of emails a day, and so work as well for the same person when they receive 10 emails a day as when they receive a hundred. (That's within-subject, not across-subject variance, for you psych geeks.)

Which also brings us to the issue of whether normative behaviour modification when it comes to email is a good idea in the first place: email use co-evolves with the existence of other collaboration & communication tools, and some of the reasons for behaviour modification (or context-sensitivity, for that matter) might no longer exist, and the associated behaviours might simply cease to be. 

This is a worthy experiment. Are there any interaction designers, inspired engineers or tinkerers who want to take this on? I'm interested in developing this further. 

Notes:
1. Gratuitous arguments using the Chandler Project will be summarily ignored.

  1. recognize proper nouns, especially when followed by ’s, and treat them appropriately when using "change all"
  2. detect and correct word variations during a spell-check session without having to add them to the dictionary; especially important for occasional proper nouns
  3. detect delayed shift-release capitalization errors, especially for proper nouns (e.g. "my name is ARvind")
  4. detect typing error patterns within a document and adjust spelling/grammar suggestion accordingly
  5. use phrase patterns to refine suggestions ("in touh with" is probably "in touch with", not "in tough with")
  6. detect spacebar order errors ("this is n ot the case")
  7. infer multiple whitespace usage conventions from the document (don't ask each time you encounter e.g. "and so ,   we can" to change to "and so, we can" once I make the change)
  8. show me what "change all" will change, especially if it's a grammatical suggestion and/or there are variations
  9. turn off "resume" mode when clicking into the text area after loss of window focus
  10. detect compound grammar & spelling errors (e.g. "she is alectureer" should suggest "a lecturer" instead of just "lecturer" to avoid introducing a grammatical error)
  11. detect typing errors based on keyboard button proximity patterns (e.g. "traip" is more likely to be "trail" than "train")
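As a rough illustration of item 11, the sketch below ranks suggestions with an edit distance whose substitution cost depends on how far apart two keys sit. It assumes a US QWERTY layout with approximate row staggering, and the scaling constant is arbitrary; the names `KEY_POS`, `substitution_cost`, `typo_distance` and `rank_suggestions` are made up for this example and do not describe how any real spell-checker works.

```python
import math

# Approximate US QWERTY geometry: (keys in the row, horizontal offset of the row).
# The offsets are an assumption; real keyboards vary.
ROWS = [("qwertyuiop", 0.0), ("asdfghjkl", 0.25), ("zxcvbnm", 0.75)]
KEY_POS = {
    ch: (col + offset, row)
    for row, (keys, offset) in enumerate(ROWS)
    for col, ch in enumerate(keys)
}


def substitution_cost(a, b):
    """Cheap for neighbouring keys, capped at 1.0 for distant or unknown keys."""
    if a == b:
        return 0.0
    if a not in KEY_POS or b not in KEY_POS:
        return 1.0
    (x1, y1), (x2, y2) = KEY_POS[a], KEY_POS[b]
    # Dividing by 3 is an arbitrary scale so that adjacent keys cost well under 1.
    return min(1.0, math.hypot(x1 - x2, y1 - y2) / 3.0)


def typo_distance(typed, candidate):
    """Levenshtein distance with keyboard-aware substitution costs."""
    prev = [float(j) for j in range(len(candidate) + 1)]
    for i, a in enumerate(typed, 1):
        cur = [float(i)]
        for j, b in enumerate(candidate, 1):
            cur.append(min(
                prev[j] + 1.0,                          # delete a typed character
                cur[j - 1] + 1.0,                       # insert a missing character
                prev[j - 1] + substitution_cost(a, b),  # hit a (nearby) wrong key
            ))
        prev = cur
    return prev[-1]


def rank_suggestions(typed, candidates):
    """Order candidate words from most to least plausible."""
    return sorted(candidates, key=lambda w: typo_distance(typed.lower(), w.lower()))


print(rank_suggestions("traip", ["train", "trail"]))  # ['trail', 'train']
```

With a plain edit distance both candidates are one substitution away and tie; the keyboard geometry breaks the tie because "p" sits next to "l" and far from "n".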

[update: the Office Natural Language team has a post on the issue]

looking at online media services it strikes me: what use is an aggregator without a queue? why is instapaper not part of youtube and google reader and twitter?

who uses a modal window for an autosave progress indicator, constantly stealing focus from ongoing user actions? microsoft does, in mac office 2008. so much for usability testing.

Multiple virtual screen window management systems really need to figure out what to do when using external monitors. Currently, all such systems (including Apple's Spaces) simply multiply the number of virtual workspaces by the number of physical screens. However, using an additional physical screen is distinctly different from using several invisible virtual ones: physical screens are used for multiple simultaneous attentional areas, but virtual screens can only be used when intentionally switching attention.

Sometimes, virtual screens are used for peripheral attention, with the help of notification systems like Growl, but that is often the case because an extra physical screen is not available. Most virtual workspace managers have options to auto-position application-specific windows onto a virtual screen. That's great for when the number of physical screens never changes, but ends up using less display real-estate when switching from one screen to two (as in the case of laptops with an external monitor).

What is really needed is closer attention to how people position paper and other information displays on a workspace, and positioning algorithms designed around that. For instance, the sketch above shows a window manager automatically moving a specified window from invisible (1) to peripheral (2) space when an external monitor is attached.
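The rule in that example can be written down as a toy placement function. This is only a sketch under stated assumptions: the `Window` and `Placement` types and the "focus"/"peripheral" role labels are hypothetical, and it does not use the API of Spaces or any real window manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    title: str
    role: str  # "focus" or "peripheral" (hypothetical labels)


@dataclass(frozen=True)
class Placement:
    screen: int      # index of the physical screen
    workspace: int   # index of the virtual workspace on that screen
    visible: bool    # whether the window can be seen without switching


def place_windows(windows, physical_screens):
    """Decide where each window goes for the current number of screens."""
    placement = {}
    for w in windows:
        if w.role != "peripheral":
            # Focus windows stay on the primary screen's current workspace.
            placement[w.title] = Placement(screen=0, workspace=0, visible=True)
        elif physical_screens > 1:
            # Peripheral space exists: show the window on the second screen.
            placement[w.title] = Placement(screen=1, workspace=0, visible=True)
        else:
            # No peripheral space: park it on a hidden virtual workspace.
            placement[w.title] = Placement(screen=0, workspace=1, visible=False)
    return placement


windows = [Window("Editor", "focus"), Window("Chat", "peripheral")]
print(place_windows(windows, physical_screens=1)["Chat"].visible)  # False
print(place_windows(windows, physical_screens=2)["Chat"].visible)  # True
```

Re-running `place_windows` whenever a monitor is attached or detached is what moves the chat window between invisible and peripheral space, instead of multiplying workspaces by screens.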

More reasons for field research in user experience work...

Update: I hunted through the CHI and related archives, and found only one work that looked at window management strategies & multiple/virtual desktops empirically:

Ringel, M. (2003). When one isn't enough: an analysis of virtual desktop usage strategies and their implications for design. In CHI '03 extended abstracts on Human factors in computing systems (pp. 762-763). Ft. Lauderdale, Florida, USA: ACM. doi: 10.1145/765891.765976.

and two others that compared usability in terms of space usage and task switching:

Hutchings, D. R., Smith, G., Meyers, B., Czerwinski, M., & Robertson, G. (2004). Display space usage and window management operation comparisons between single monitor and multiple monitor users. In Proceedings of the working conference on Advanced visual interfaces (pp. 32-39). Gallipoli, Italy: ACM. doi: 10.1145/989863.989867.

Truemper, J. M., Sheng, H., Hilgers, M. G., Hall, R. H., Kalliny, M., & Tandon, B. (2008). Usability in multiple monitor displays. SIGMIS Database, 39(4), 74-89. doi: 10.1145/1453794.1453802.  

Most of the others were, unfortunately, proposals for new/redesigned window systems based on task/action level inspiration. Some, like the Hutchings [1] paper, and BumpTop suggest new window management actions: this still ignores the role of the computer in organizing windows, and pays more attention to user-initiated actions. While there are works on spatial memory, attention and activity-based window organization [2] I couldn't find anything that puts information organization, task/activity structure, memory & attention, and object/window manipulation together to treat the desktop as a workspace. Interesting how the metaphor lost almost all salience in translation...

What's interesting is that work I did on workspaces while at Steelcase informs most of these issues, even though we were not remotely aiming for that. I guess I'll just have to write a CHI paper, then.

References

Chapuis, O., & Roussel, N. (2005). Metisse is not a 3D desktop! In Proceedings of the 18th annual ACM symposium on User interface software and technology (pp. 13-22). Seattle, WA, USA: ACM. doi: 10.1145/1095034.1095038.

[1] Hutchings, D. R., & Stasko, J. (2002). QuickSpace: new operations for the desktop metaphor. In CHI '02 extended abstracts on Human factors in computing systems (pp. 802-803). Minneapolis, Minnesota, USA: ACM. doi: 10.1145/506443.506605.

Hutchings, D. R., & Stasko, J. (2004a). Revisiting display space management: understanding current practice to inform next-generation design. In Proceedings of Graphics Interface 2004 (pp. 127-134). London, Ontario, Canada: Canadian Human-Computer Communications Society. Retrieved June 16, 2009, from http://portal.acm.org/citation.cfm?id=1006058.1006074&coll=Portal&dl=ACM&CFID=41002027&CFTOKEN=29749222.

Hutchings, D. R., & Stasko, J. (2004b). Shrinking window operations for expanding display space. In Proceedings of the working conference on Advanced visual interfaces (pp. 350-353). Gallipoli, Italy: ACM. doi: 10.1145/989863.989922.

Khan, A., Matejka, J., Fitzmaurice, G., & Kurtenbach, G. (2005). Spotlight: directing users' attention on large displays. In Proceedings of the SIGCHI conference on Human factors in computing systems (pp. 791-798). Portland, Oregon, USA: ACM. doi: 10.1145/1054972.1055082.

Robertson, G., Czerwinski, M., Larson, K., Robbins, D. C., Thiel, D., & Dantzich, M. V. (1998). Data mountain: using spatial memory for document management. In Proceedings of the 11th annual ACM symposium on User interface software and technology (pp. 153-162). San Francisco, California, United States: ACM. doi: 10.1145/288392.288596.

Robertson, G., Horvitz, E., Czerwinski, M., Baudisch, P., Hutchings, D. R., Meyers, B., et al. (2004). Scalable Fabric: flexible task management. In Proceedings of the working conference on Advanced visual interfaces (pp. 85-89). Gallipoli, Italy: ACM. doi: 10.1145/989863.989874.

Stuerzlinger, W., Chapuis, O., Phillips, D., & Roussel, N. (2006). User interface façades: towards fully adaptable user interfaces. In Proceedings of the 19th annual ACM symposium on User interface software and technology (pp. 309-318). Montreux, Switzerland: ACM. doi: 10.1145/1166253.1166301.

Tashman, C. (2006). WindowScape: a task oriented window manager. In Proceedings of the 19th annual ACM symposium on User interface software and technology (pp. 77-80). Montreux, Switzerland: ACM. doi: 10.1145/1166253.1166266.

[2] Bardram, J., Bunde-Pedersen, J., & Soegaard, M. (2006). Support for activity-based computing in a personal computing operating system. In Proceedings of the SIGCHI conference on Human Factors in computing systems (pp. 211-220). Montréal, Québec, Canada: ACM. doi: 10.1145/1124772.1124805.

there appears to be no way to ensure a consistent spaces & exposé experience across a macbook and an external mac keyboard, because they are hardwired to have different shortcuts. of course, the control panel doesn't allow additional shortcuts to be defined for the spaces/exposé actions, with the end result that i'm forced to rewire my fingers, to use a keyboard that has more keys than i need - just not the right ones.

meh, apple.

theorem: there exists no (time, system resource, interaction, cognitive load) efficient method to perform a breadth-first search on amazon using the wimp+tabbed browser paradigm

corollary: 10 years after google invented pagerank, we still don't know how to operate on internet graph structures as first-class interaction objects.

RTFM, reprise, originally uploaded by steelmonkey.

An on-flight entertainment system that includes instructions for its own use. Interestingly, the menu button which brings up all the interesting things to do is the only one unmarked. Was this arrived at in trying to hide complexity by foregrounding the viewing controls in comparison with everything else? It only ends up making half the functionality invisible. Lesson: reduction of complexity is not the same as simple design.

Gesturing has gotten a bad name: it's become interpreted to mean any action performed with a pointer that is not a direct manipulation, and consequently measured in terms of mathematical and statistical properties of a time-series of pointer locations. This creates things like touchscreen flicking (good), graffiti & similar gesture-based writing (interesting but not terribly intuitive) and firefox mouse gesture extensions (send out the SOSs).

But ''human'' gesturing - deixis - is both a lot more complex and a lot simpler. Deixis serves two purposes: a) it identifies & addresses[1], and b) it illustrates.

Consider identification & addressing (we will deal with illustration later). When we say something like "give me ''that''" and point at an object, we use a speech marker (the deictic "that") to indicate that something is being addressed, and we are using the shape of the arm & hand and its orientation to identify it. But pointing is not necessary: the "that" can also refer to something which has been mentioned previously in the conversation, or the same sentence (the [http://en.wikipedia.org/wiki/Anaphora_%28linguistics%29 anaphoric] "that"). In both cases, the use of "that" depends on the act of pointing or a supporting linguistic context; in the case of pointing, the deictic ''that'' and the extended arm are co-dependent - neither makes much sense without the other.

Consider the WIMP & direct manipulation mode of interaction: it relies on a basic vocabulary of pointing, clicking, and dragging with a mouse/pen/tablet or finger; the keyboard is primarily used to navigate within the set of clickable/draggable/pointable interactive objects. Combine these actions wilfully and you get such wonders as 9 button mice, triple and quadruple clicks, and, of course, mouse gestures. Interaction with this vocabulary does not have a grammar: any action can be followed by any other action (this is ''pointing''). Consider the command-line mode of interaction: it is linguistic (well, it has a grammar), relies heavily on the keyboard, and has methods of addressing, usually through variable-passing mechanisms and naming patterns (in other words, it has the second kind of deictic ''that'').

Two different systems of interaction, designed (unintentionally, one cynically suspects) to be separate and mutually exclusive. And yet, each an essential component of completely ordinary, effortless and extremely effective acts of deixis that people use with each other every day. [2]

That's all you need: pointing and addressing, performed simultaneously or in some standard sequence. For instance, here's what a truly deictic method of renaming a bunch of files could look like [3]:

1. you start typing: "take these "
2. "these" becomes bold - "'''these'''" - indicating that a deictic signifier has been encountered
3. you continue typing: "files " - the text becomes "'''these files'''"
4. you switch to a file browser window and select an arbitrary bunch of files
5. you continue typing: "and rename, replacing 'today' with 'tomorrow' if modified today" and press enter

Compare this with how you would have done it using only the WIMP method, or only the command line (I leave mocking-up or prototyping this to a more enthusiastic soul than I)
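Short of a real mock-up, the binding step in that walkthrough can at least be sketched in Python. This is a toy under loud assumptions: it only understands the one phrase shape used in the example, it marks the deictic word alone (not the following noun), and it returns the renames it would make instead of touching any files. The names `mark_deictics`, `find_deictics`, `resolve` and `RENAME_RULE` are invented for this sketch.

```python
import re
from datetime import date

DEICTIC = re.compile(r"\b(this|that|these|those)\b", re.IGNORECASE)

# Toy grammar: only "replacing 'X' with 'Y'", optionally "if modified today".
RENAME_RULE = re.compile(
    r"replacing '(?P<old>[^']+)' with '(?P<new>[^']*)'(?P<cond> if modified today)?"
)


def mark_deictics(text):
    """Wrap deictic words in ''' ''' to show they are waiting for a referent."""
    return DEICTIC.sub(r"'''\1'''", text)


def find_deictics(text):
    """Return the deictic words found in the typed text, in order."""
    return [m.group(1).lower() for m in DEICTIC.finditer(text)]


def resolve(command, selection, today):
    """Bind the deictic word to the pointed-at selection and plan the renames.

    `selection` is a list of (file_name, modified_date) pairs standing in for
    whatever was selected in the file browser. Returns {old_name: new_name}.
    """
    if not find_deictics(command):
        raise ValueError("no deictic word to bind the selection to")
    rule = RENAME_RULE.search(command)
    if rule is None:
        raise ValueError("unsupported command")
    renames = {}
    for name, modified in selection:
        if rule["cond"] and modified != today:
            continue  # the condition excludes files not modified today
        if rule["old"] in name:
            renames[name] = name.replace(rule["old"], rule["new"])
    return renames


command = ("take these files and rename, "
           "replacing 'today' with 'tomorrow' if modified today")
selection = [
    ("notes-today.txt", date(2009, 6, 16)),
    ("plan-today.txt", date(2009, 6, 15)),
]
print(resolve(command, selection, today=date(2009, 6, 16)))
# {'notes-today.txt': 'notes-tomorrow.txt'}
```

The interesting part is not the parsing but the signature of `resolve`: the typed sentence and the pointed-at selection arrive together, and neither is sufficient on its own.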

Interaction Design, meet Deixis. Deixis, Interaction Design.

Notes:
[1] 'addressing' here has two senses; in human speech, identifying a subject, or directing a remark; in computer science: specifying the location of some information
[2] there is no point discussing how tangible interaction methods will change all this unless computing starts to happen in the environment instead of on a two-dimensional display
[3] at least one problem will get in the way of making this possible right now: the notion of focus (as it pertains to interface controls); no window manager has worked out a good (i.e. consistent & predictable) method of switching focus while performing a pointing (clicking, dragging, etc.) action

References:
1. The Mozilla [http://labs.mozilla.com/projects/ubiquity/ Ubiquity] Project
2. [http://humanized.com/enso/ Enso], [http://www.blacktree.com/ Quicksilver,] [http://do.davebsd.com/ Gnome-Do] and others
3. Blas, N. D., & Paolini, P. (2003). There and back again: What happens to phoric elements in a `web dialogue'. Document Design, 4, 194-206. doi: 10.1075/dd.4.3.01di

0321_172409

the display (oriented towards the purchaser) on this point of sale terminal counts up to $99,999.99 - however, this is in a neighbourhood grocery store, where a sale that large is, let's say, highly unlikely... it would have been much more useful to have used a portion of the display to show the last item scanned along with the total...
