[pageinfo chapter=”hybrid-devices” chapnos=”9″ chaptitle=”Hybrid Devices”/]
Abstraction — software as if hardware doesn’t matter
The computer being used to write this chapter is a laptop with a trackpad. The awkward angles needed to use the trackpad continuously lead to pains in the wrist and hands, so there is also a wireless mouse. However, when swapping between mouse and trackpad there is no need to swap software: the same word processor works equally well with both. If this seems unsurprising to you, think again about how different a mouse and trackpad are; it is like going to a restaurant and being given a fork instead of a soup spoon. Even more amazing, the same web pages you can use on the laptop with mouse or trackpad can also be used on the phone, clicking links using your finger alone.
Now this seems so obvious you need to really step back to realise it is surprising at all. However, things were not always this way and the identification of abstract devices was crucial to the development of the modern user interface.
Early terminals had keyboards, so there was an easy first abstraction, text entry, as all had the basic alphabet and numbers. However, even here there were numerous differences, from plain keyboards to those with special keys for particular functions; even cursor keys came slowly, and so early software often had different mappings from keys and key combinations to actions. Nowadays this is virtually obsolete, partly from keyboard standardisation and partly due to the increasing reliance on the mouse (or trackpad!), but the remnants of this can still be seen in the slightly different mappings on Mac and Windows software.
The origins of the mouse date back to the early 1960s with Douglas Engelbart’s innovative Augmentation Research Center. While the mouse was used in the [hilite-pink]very first Macintosh computer[/hilite-pink] [comment]mention the Xerox Star? [/comment], it was in fact some time before it became more widely used in personal computers, largely because most did not have graphical windows-based user interfaces and instead were based on character maps.
However, before the desktop computer had become commonplace, there were a variety of high-end graphical workstations for use in specialised areas such as CAD and scientific visualisation. These often needed some way to draw, and select lines and areas on screen, but varied tremendously in the devices used to achieve this. Some had light-pens which could be used to directly touch points on the screen, or draw on the screen as if it were paper. Others used tablets with ‘pucks’, a bit like a mouse except that the position of the puck on the tablet maps directly to positions on the screen, whereas with the mouse it is only the movement that matters. Some CAD workstations had small joysticks, or two small thumbwheels: one controlling the horizontal ‘x’ location and one the vertical ‘y’ location.
This variety of hardware did not matter when computers were so specialised that software and hardware were delivered together as a package. However, in the late 1960s and early 1970s those working in computer graphics began to look for more generic ways to describe and build interactive graphical software and identified a number of key functions, all of which could be achieved using a combination of a keyboard and some form of pointing device [[Ne68, Wa76]]. Low-level software converted the varying signals generated by the different forms of pointing device into a uniform digital format. By the time the current windows-based operating systems (X/Linux, MacOS, Windows) were developed, they all had generic ‘mouse’ devices, which in fact can be just about anything that generates an x–y coordinate and has the means to select through pressing a button or otherwise.
This abstraction away from hardware has been incredibly successful. Because of it, when laptops appeared that used alternative devices instead of the mouse (trackballs, keyboard nipples and trackpads), there was no need to develop new software, simply a ‘driver’ for the new device that made it appear to the software just the same as the mouse. Similarly, direct screen devices such as styluses on tablet computers or direct finger touch interactions are ‘just like a mouse’ to the word processor or web browser with which you are interacting.
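The idea of a driver that makes different hardware look like one generic pointing device can be sketched in a few lines of Python. This is only an illustration, not how any real operating system does it: the names `PointerEvent`, `MouseDriver` and `TabletDriver`, and the screen and tablet sizes, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointerEvent:
    """Uniform event seen by applications: a screen position and a button flag."""
    x: int
    y: int
    pressed: bool


class MouseDriver:
    """Relative device: only movement is reported, so the driver tracks position."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.x, self.y = width // 2, height // 2  # assume we start mid-screen

    def report(self, dx, dy, pressed):
        # Accumulate the movement and clamp the result to the screen.
        self.x = min(max(self.x + dx, 0), self.width - 1)
        self.y = min(max(self.y + dy, 0), self.height - 1)
        return PointerEvent(self.x, self.y, pressed)


class TabletDriver:
    """Absolute device: the puck position maps directly to a screen position."""

    def __init__(self, width, height, tablet_w, tablet_h):
        self.width, self.height = width, height
        self.tablet_w, self.tablet_h = tablet_w, tablet_h

    def report(self, tx, ty, pressed):
        # Scale tablet coordinates (0..tablet_w-1) to screen coordinates.
        x = tx * (self.width - 1) // (self.tablet_w - 1)
        y = ty * (self.height - 1) // (self.tablet_h - 1)
        return PointerEvent(x, y, pressed)


mouse = MouseDriver(800, 600)
tablet = TabletDriver(800, 600, tablet_w=1000, tablet_h=1000)
print(mouse.report(10, -5, pressed=False))
print(tablet.report(999, 0, pressed=True))
```

The application only ever sees `PointerEvent` values, so it does not need to know which of the two devices produced them.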
This process has continued when radically new hardware capabilities have appeared, for example, both MacOSX and Windows now provide the programmer with generic scroll wheel events. Multi-touch devices, such as the iPhone that use two finger gestures, have proved challenging as two fingers are definitely not the same as two mice (and anyway how many systems use two mice!). Apple, providing both software and hardware, are in a similar position to early graphics workstation manufacturers, able to tune the software to particular hardware characteristics, whereas Microsoft wish to provide a variant of Windows that can run on any multi-touch hardware and so had to work out new abstractions over the common features [[WM09]].
[[@fg:light-pen]]Light pen in use
The limits of hardware abstraction
It would be very surprising in a book about physicality if we ended up saying the precise nature of hardware doesn’t matter. Abstractions are both theoretically elegant and practically useful, and the importance and utility of suitable ways to abstract away from the specifics of devices should not be underestimated. However, there are limitations: a mouse is NOT the same as a trackpad (if it were, there would be no reason to buy a mouse to go with the laptop). Similarly, not all phone keypads are the same, nor all TV remotes or washing machine controls.
Very early on in the quest for abstractions over keyboards and pointing devices, there were voices warning about these limitations. Bill Buxton was one of these and pointed out differences in both the intrinsic capabilities of the then popular pointing devices and also in the ease with which different devices could be used for specific tasks.
The first of these, the intrinsic capabilities, showed that pointing devices differ even at a very abstract level. Buxton distinguished three states [[Bu90]]:
- state 0 — no tracking, no pointer shown on screen. e.g. a light pen or touch screen when the pen or finger is far from the screen
- state 1 — tracking, pointer shown on screen, but just to show you where it is. e.g. a mouse when no buttons are pressed.
- state 2 — dragging, pointer shown and something happening such as a window being moved, or a line being drawn. e.g. a mouse when the button is held down
A mouse is only ever in states 1 or 2, whereas the light pens available at the time could have all three states: not in contact with the screen, so no tracking; in contact with the screen and tracking; and in contact with the screen with the button pressed for dragging. Of course you can lift a mouse off the table, but this simply leaves the cursor on the screen where it is. In contrast, if you lift your finger off the screen and then put it down somewhere else, the ‘pointer’ jumps to the new position.
In fact this particular difference is not too much of a problem. Most applications are designed for the mouse, and the light pen had more capability than the mouse. So, unless the programmer did not deal well with sudden jumps in mouse position, all was well.
More problematic is that many touch screens, both stylus-based and finger-based, only operate in states 0 and 2. When you are not in contact with the screen there is no detection of location, and when you are in contact it is treated as a ‘selection’. There is an argument that state 1 is only needed because the mouse is an indirect device, moved on the tabletop to affect the screen. You don’t need state 1 with the stylus as you can see where it is on the screen. However, state 1 allows pixel-precision positioning of the mouse cursor, whereas with touch-based interfaces fine positioning is very difficult (the ‘fat finger’ problem).
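The comparison of devices by the states they support can be written down as simple set operations. The sketch below encodes only what the text says about each device; the names `State`, `DEVICE_STATES`, `missing_states` and `can_stand_in_for` are hypothetical, and the "touch screen" entry stands for the many touch screens that lack state 1, not for all of them.

```python
from enum import IntEnum


class State(IntEnum):
    OUT_OF_RANGE = 0  # no tracking, no pointer shown
    TRACKING = 1      # pointer shown and following the device
    DRAGGING = 2      # pointer shown and something being moved or drawn


# States each device can be in, as described in the text.
DEVICE_STATES = {
    "mouse": {State.TRACKING, State.DRAGGING},
    "light pen": {State.OUT_OF_RANGE, State.TRACKING, State.DRAGGING},
    "touch screen": {State.OUT_OF_RANGE, State.DRAGGING},
}


def missing_states(device):
    """States from the three-state model that the device cannot be in."""
    return set(State) - DEVICE_STATES[device]


def can_stand_in_for(device, other):
    """True if `device` supports every state that `other` supports."""
    return DEVICE_STATES[other] <= DEVICE_STATES[device]


print(can_stand_in_for("light pen", "mouse"))
print(can_stand_in_for("touch screen", "mouse"))
print(missing_states("touch screen"))
```

Under these assumptions the light pen can stand in for the mouse, but the touch screen cannot, because it has no tracking state.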
There are ways round this, for example, the iPhone adds an extra layer of interaction: if you touch and move it is treated as state 2 dragging, but if you touch and stay still a small magnified view is shown and movement treated as state 1 with the lifting of the finger acting as selection. However, this allows no way to perform a drag action with pixel-level accuracy for the start and end points, so the text applications all have text selections where the end points can be individually dragged.
However, even when devices are capable of the same things, it does not mean they are equivalent to use. Buxton showed this by comparing children’s drawing toys. Like the early CAD workstations, Etch-a-Sketch has separate knobs for horizontal and vertical movement (see Fig. [[fg:etch-a-sketch]]). This makes it really easy to draw the sides of a rectangle, but hard to draw a smooth diagonal line or circle. The mouse of course has the opposite characteristics, fine for any movement, but without the precision of separate x–y controls. Indeed, in a drawing program you may have found yourself sizing boxes by separately dragging the side and top edges (x or y one at a time) rather than dragging the corner, or used text entry boxes to give precise x and y coordinates.
Even the mouse itself differs considerably between brands. The earliest mice often had the buttons on the end whereas most modern mice have buttons on the top (or in the case of the Apple mouse the whole top surface). [hilite]Having buttons on the top makes long drags difficult or impossible, and with a very large display it becomes necessary to move things around by several smaller moves.[/hilite] [comment]Needs a bit more explanation[/comment]
These differences between devices affect performance. In Chapter [[ch:body-and-mind]], we described Fitts’ Law, which predicts how long it takes to make positioning movements with a pointing device.
time = A + B log ( distance / size )
Whilst any single device tends to follow the logarithmic law across different distances and sizes of target, the constants A and B differ markedly from one device to another, and this is used as the basis of the ISO 9241 standard to measure ‘non-keyboard’ devices.
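A small Python sketch shows how the same formula gives different predictions once A and B change. The constants below are invented purely for illustration and are not measurements of any real device; the use of a base-2 logarithm is also an assumption, since the formula above does not state the base.

```python
import math


def movement_time(a, b, distance, size):
    """Predicted time = A + B * log2(distance / size)."""
    return a + b * math.log2(distance / size)


# Hypothetical (A, B) constants in seconds, chosen only for illustration.
DEVICES = {
    "mouse": (0.10, 0.15),
    "trackpad": (0.20, 0.25),
}

for name, (a, b) in DEVICES.items():
    near = movement_time(a, b, distance=80, size=20)
    far = movement_time(a, b, distance=640, size=20)
    print(f"{name}: near target {near:.2f}s, far target {far:.2f}s")
```

With these made-up constants both devices follow the same logarithmic shape, but the second is slower for every target and the gap widens as the target gets further away.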
However, the differences are even more important when considering specialised tasks. Many computer artists prefer to use a tablet and pen rather than mouse as the combination of the angle you hold the pen and the fact that there is a direct mapping between location on the tablet and screen location makes it seem more ‘natural’, more like a real pencil or brush. [comment] I thinks there are also some things we could say here about the physicality of the mouse: the fact that the mouse ball (or laser) is placed at the centre of the mouse rather than at the end where the finger control happens probably has a strong bearing [/comment] Similarly you can buy a small steering wheel into which a Wii-mote unit fits. This adds nothing to the capabilities of the Wii-mote, it is simply a holder, but makes the device feel much more intuitive when playing driving games.
Of course, not least among the differences between ‘equivalent’ devices are their ergonomic characteristics. The reason for the use of the additional mouse with the laptop is to alleviate the muscle and joint strain of using the laptop trackpad.
Specialisation — information appliances [comment]is information appliances still the appropriate term? Computer embedded devices?[/comment]
While it is possible to regard computers as pretty much similar, the same cannot be said for kitchen appliances. The controls for a cooker, washing machine, dishwasher, microwave or food mixer all look different with specialised dials and buttons for particular functions. Because the computer is ‘general purpose’ it has a one-size-fits-all collection of devices (mouse, keyboard, screen), whereas more specialised consumer goods have their controls designed specifically for purpose.
There is still a level of similarity with individual controls on each device with recognisable buttons and dials. However, the number of each and the way in which they are laid out are individual and special. The computer uses a single physical device (the mouse/trackpad) and then makes it serve many purposes, often by showing virtual ‘buttons’ on the screen. In contrast, consumer appliances [hilite] tend to have one button per function.[/hilite] [comment]White goods maybe? I can think of quite a few consumer appliances with multifunction keys [/comment] Furthermore, even the dials differ: some can be moved continuously within some range, others have a number of ‘clicks’ relating to the number of options they control. [comment]photos of different dials[/comment]
While the dials and buttons are generic, there are sometimes very special controls designed for the particular purpose such as the steering wheel on a car. Some of these have a relatively arbitrary connection to the function they perform; for example, the gear stick on a car has its particular form due to the mechanics of the gear box, but for the ordinary driver it is just the way it is. [comment] there is a link here to some stuff I wrote about physicality and car controls towards the end of Chapter 20 [/comment] However, other controls are intimately connected with the act of using the device, for example the food mixer that turns on when you press down on it, or the digital scales that automatically turn on when you step on them. The latter can be particularly intuitive to the extent that the user may not even think “I’m now turning this on”; it just happens at the right moment. [comment]ran out of steam but feel there is more to say something about all going digital!! and about more clear hybrid devices such as iPod[/comment]
What does it do?
When a device has just an on/off button it is easy to know what to do, but when faced by dozens of buttons on a remote control it may not be so obvious. Equally hard may be cases when it is not obvious that something can be controlled at all.
[[@fg:pepper-grinder]] Pepper grinder
Figure [[fg:pepper-grinder]] shows a pepper grinder. A hapless guest might spend some time trying to work out what to twist to get it to grind. In fact it is an electric grinder and the metal disk on top is not decoration, but a switch that turns on the motorized grinder and a small light to boot. This highlights that there are at least three things you need to know before you can even attempt to use a device:
1. know what the device is capable of doing, its functionality
2. know what controls are available to you
3. know the mapping between the controls and the functions
The pepper grinder fails on both (1) and (2)!
The first of these seems most fundamental, but in fact often if you can grasp (2) it is possible to work out (1) and (3) through experimentation, albeit possibly embarrassing yourself or causing damage on the way; think of the consternation of ‘Q’ as James Bond playfully presses every button on the missile-packed sports car.
In fact we have encountered these issues already in the form of affordances. Just as the rock of a certain height affords sitting, so also the pepper pot affords grinding pepper; however, it may lack the perceptual affordance that says what to do to achieve it.
One might think these are only issues for the newcomer to the device such as the houseguest. However, it can affect even frequent users. One of the authors was giving a talk about physicality and using the light switches in the room in order to illustrate a point. They were the kind that you press and the light goes on, then press again to make it go off. To illustrate that the action of pressing the switch is in fact two parts, press in and release, the author pushed the switch in and held it for a few moments while talking. To his surprise and that of the room the lights began to dim. What had appeared to be a simple on/off switch was in fact a dimmer. What was particularly surprising was that none of the people at the talk, several of whom taught regularly in the room, knew of the extra functionality.
Note that a more traditional dimmer switch would use a rotating knob to control the internal electronics directly. The knob suggests that it controls something variable, and thus would make it more likely that the users of the room would have discovered the dimmer functionality for themselves.
These problems are particularly common for those flat buttons where a thin plastic sheet covers a contact below, or which operate by touch alone. These are easy to clean so have advantages in public areas as well as parts of the home such as the kitchen where hands may be dirty. However, it is common to see people pressing the sign beside the button instead of the button itself as both are flat, plastic, and covered by many previous people’s finger marks. This is not helped when the notice says “press here”!
We will return to aspects of (2) later in the chapter[comment]is this too after away from the description?[/comment], but for now let’s assume you have some idea of what the device does, and can see what controls are available. You are then faced with problem (3): “what does what”, often called ‘mapping’.
This mapping between physical controls and functions has been a point of interest since the earliest days of human computer interaction research. One of Don Norman’s examples is the electric cooker. There are four dials and four rings, but which dial controls which ring? Often the controls are placed in a line on the front of the cooker or above on a separate panel. The two dials on the left control the left two rings, but what about back and front? Some cookers instead place the controls alongside the rings in a line from back to front; now it is not even obvious which is left and which right. Of course, the dials each have a little image beside them intended to make this clear, but even if you can work out what they mean, do you manage to do this quickly enough when the pan is about to boil over?
[[@fg:cooker-knobs]] Cooker hob knobs — note picture above each to attempt to clarify mapping
Physical placement helps users understand mapping. If things are on or near the item they control then you at least know which device they apply to. You may sometimes get confused as to which remote control is which, but are unlikely to go to the controls on the front of the TV when you mean to turn on the HiFi. There are limits to this. The remote control may confuse you, but it saves you getting out of the chair. Also with the cooker, one could imagine having a separate dial for each ring placed right next to the ring, but of course you would burn yourself whenever you tried to use them.
Where the things that are being controlled have some form of physical layout, then reflecting this in the controls themselves can help; for example, if there is a line of lights in a room, organising the light switches to be in the same order (see Fig. [[fg:power-sockets]]). With the cooker, we could imagine laying out the dials in a square reflecting the layout on the hotplate, but of course this would take more space than laying them in a line. [comment] this relates in some ways to the Roombooker case study in Chapter 20[/comment]
[[@fg:power-sockets]] (i) power socket switches have clear physical correspondence, but (ii) what do these light switches do?
For more abstract functions, such as the time and power settings of a microwave oven, or channel and volume selection in a TV remote, there is no direct physical correspondence. However, physical appearance and layout can still help users to establish a mapping. Look at the microwave oven controls in Figure [[fg:microwave]]. Related controls have similar appearance and are grouped together. [comment] I think I have some material on a remote control design which may be helpful here [/comment]
[[@fg:microwave]] microwave oven control panel
There are also metaphorical positions associated with some concepts. Up, loud, large and forward are ‘positive’, so on a TV remote where there are arrow buttons for volume control, we expect the upward pointing arrow to make the sound louder, and for channel selection we expect up to increase the channel number and down to decrease it. [comment]this relates to some stuff I’ve written in the past relating to joysticks and mapping [/comment]
Left and right are somewhat more complex. In a dextra-oriented society right is usually the ‘positive’ direction, but this is made more complicated by reading order. In left-to-right languages such as European languages, the two agree, and in particular notices, images and controls that need to be read or operated in a sequence should flow left-to-right, but where the language flows right-to-left sequences also flow in this direction.
Interestingly the top-to-bottom reading order of English and other languages also causes a conflict for temporal ordering: is forward in time up or down? You will find examples of both being used for information display, but where you want someone to use a set of controls in a particular sequence, left-to-right and top-to-bottom reading order wins (for English and European languages).
The observant reader may notice another positioning conflict in the microwave controls in Figure [[fg:microwave]]. Along the bottom of the panel are three buttons, which add 10 minutes, 1 minute and 10 seconds respectively to the total cooking time; that is, bigger to the left, the opposite of the general right=positive=bigger rule. However, this reflects the order in which the digits are written in the display: when you write numbers it is the digit corresponding to the biggest unit that comes first. [comment]story of scroll arrows here or virtual chapter[/comment]
So, as you design physical correspondences you need to be aware that there may be several potential correspondences and the one the user takes may not be the one you intend. Where there is potential for confusion you can either:
(a) attempt to remove one of the ambiguous correspondences by repositioning controls, for example, putting the time controls vertically rather than horizontally.
(b) increase the physical connection of one so that it dominates, for example, placing the time controls directly below the display making the correspondence between digits on the display and the button order more obvious.
(c) add additional labels, or other decoration to disambiguate, for example, make the ‘mins’ and ‘secs’ a little more salient, although this shift from physical to symbolic may fail when users are stressed … just like the cooker control labels when the pan is boiling over.
(d) perform a user study to see whether one of the physical correspondences is the natural interpretation; in fact the time controls on the microwave appear to work without any errors, so in this case the digit order appears to be the natural interpretation.
Even if physical correspondences have not been explicitly designed, users will often see them there. Recall the story of the fire alarm in Chapter [[ch:interacting-with-physical-objects]]. In that case the fire alarm button was next to the door. This is a sensible position for a fire alarm, but it is where you normally expect a light switch to be.
The examples so far have all been for very ‘ordinary’ interfaces; however, the same issues arise when designing more innovative interactions. The “expected, sensed, and desired” framework was developed as part of the Equator project in order to analyse and generate mappings in novel devices [[BS05]]. Figure [[fg:augurscope]] shows one device analysed, the Augurscope, which was used to view 3D virtual worlds. The user either pushes the small trolley or holds the detachable screen in their hands while walking around Nottingham Castle. When they point the screen at a location they see a reconstructed 3D view of what was there in the past.
[[@fg:augurscope]] The Augurscope II, stand-mounted and hand-held (from [[BS05]])
The framework considers three things:
- expected — what actions is the user likely to perform on the device. For example, the device may be pointed in different directions, or used while walking around.
- sensed — what manipulations of the device can the sensors embedded in the device detect. The Augurscope was equipped with a GPS and an electronic compass (as is now common in mobile phones and other hand-held devices, but not at the time).
- desired — what functionality is wanted for the device. For example, the ability to look at the scene in different directions.
Having identified elements in these three categories one can use them to look for potential matches, mismatches and opportunities. Figure [[fg:sensed-desired]] shows the space of possible overlaps and gaps between these categories.
In the centre are the things for which the device is already catering well: desired functionality for which there is an expected user action, which can be sensed using the existing sensors. The simple act of turning the device around fits in this area: if the user is looking in one direction and can see the historic reconstruction in that direction, it is natural to turn the device to face other ways; this can be sensed using the compass and so the desired functionality of exploring the 3D reconstruction is achieved.
Other parts of the framework suggest potential problems. On the right are things that are desired and sensed but not expected. The seminar room light switch mentioned earlier demonstrates this. Part of the desired functionality was to dim the lights and this was mapped onto holding the switch down; however this was not an expected action for the user and so it remained undiscovered until the author’s lecture. [comment] is there an opportunity here (or somewhere close by to talk a bit about the light switches in our new building which adopt the physical form of a standard light switch but which are computer mediated and near impossible to operate…[/comment]
However, the framework can be used as an inspiration or to identify opportunities for design. At the bottom is desired functionality that is currently not supported at all in the device, whilst at the top are actions that are expected and sensed but for which there is no currently desired functionality: can the latter be used to offer ways of achieving the desired functionality? One of the desired, but not supported features of the Augurscope was to explore caves beneath the castle grounds. The Augurscope permitted ‘flying’ above the ground (to see a bird’s eye view of an area) by tilting the Augurscope screen downwards, but while it may be expected that users might make the opposite upward movement and this could be sensed with the compass (top area), there was no desired function mapped to this. This suggests a potential way in which the unsupported cave-viewing functionality could be mapped onto the upward tilt, sort of dropping into the ground, and then reversed by a downward tilt which would ‘fly’ back up to the surface.
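One way to picture the analysis is as set operations over labelled items. The sketch below is a simplification and not the framework’s own notation: it treats actions and functions as plain labels in three sets, the function name `regions` and the region names are hypothetical, and the set memberships are one reading of the two examples in the text.

```python
def regions(expected, sensed, desired):
    """Split items into some of the overlaps and gaps discussed in the text."""
    return {
        # expected, sensed and desired: already catered for
        "well supported": expected & sensed & desired,
        # sensed and desired but not expected: likely to stay undiscovered
        "hard to discover": (sensed & desired) - expected,
        # expected and sensed but not desired: a design opportunity
        "opportunity": (expected & sensed) - desired,
        # desired but neither expected nor sensed: not supported at all
        "unsupported": desired - expected - sensed,
    }


# The Augurscope examples from the text.
augurscope = regions(
    expected={"turn device", "tilt screen upwards"},
    sensed={"turn device", "tilt screen upwards"},
    desired={"turn device", "explore caves"},
)

# The seminar room light switch.
light_switch = regions(
    expected={"press switch"},
    sensed={"press switch", "hold switch down"},
    desired={"press switch", "hold switch down"},
)

print(augurscope["opportunity"], augurscope["unsupported"])
print(light_switch["hard to discover"])
```

Seeing an item in the opportunity region alongside one in the unsupported region is the prompt to ask whether the first could be used to deliver the second, as with the upward tilt and the caves.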
[[@fg:sensed-desired]] Overlaps and gaps identified by the framework (from [[BS05]])
Another key issue since the earliest days of human–computer interaction has been feedback: letting the user know what has happened. When you type on a proper keyboard you can hear that you have typed a key from the sound it makes; the sound is natural feedback. However, in a noisy street you may not be able to hear this click of the key and so cash machines often make an additional loud beep for each key that you press. Without this you may be unsure that you pressed the key properly and so press it again. In some circumstances this may be fine as the second press may not do anything, or at least not be damaging, but in others pressing a key twice may not be what is wanted at all.
In “The Design of Everyday Things” [[No98]] Don Norman suggested that human action can be seen in terms of a seven stage cycle. Four of the stages relate to the execution of an action: deciding what to do and doing it:
1. establishing a goal or desired state of the world (e.g. have document secure)
2. forming an intention to act (e.g. save the document)
3. producing a sequence of actions (e.g. move mouse to ‘save’ button then click)
4. executing the action (e.g. actually move hand and fingers)
These stages can be used to diagnose different kinds of problems. In particular James Reason [[Re90]] distinguished two kinds of human error: 1) mistakes where the user is trying to do the wrong thing and 2) slips where the user is trying to do the right thing, but in some way fails to achieve it. The former are effectively failures in stage 2 whereas the latter are failures of stages 3 and 4.
However, it is the second part of the cycle which is of interest here, the three stages of evaluating an action: working out whether it did what was intended:
5. perceiving the state of the world (e.g. see alert box “file already exists”)
6. interpreting the perceived state (e.g. understanding the words)
7. evaluating the resulting situation with respect to the goals and intentions (e.g. deciding that the document needs to be stored in a different place)
All these stages critically depend on feedback, having sufficient information available from the world (including a computer system or electronic device) to work out whether the right thing happened. As in the execution side, failures can happen at different points. A small red light on the car dashboard may not be noticed at all (failure in stage 5), or if noticed the driver may think it means the petrol is nearly empty whereas it in fact means the engine is seriously malfunctioning (failure in stage 6).
Physical objects often create feedback naturally because of what they are: you lift a mug and you can feel its weight as it lifts off the table, drop it and you hear the crash as it hits the floor. However, with electronic and hybrid devices it is often necessary to add feedback explicitly for digital effects. For example, you do not hear the sound of an email squashing its way through the network cable, but software may add a sound effect.
Imagine you are about to make a call on a mobile phone and start to enter the number. You will experience several different forms of feedback:
- you feel the key being pressed
- you hear a simulated key click sound
- the number appears on the screen
Note that the first of these is connected purely with the physical device: you still feel it even if the battery is removed. The second is a sort of simulated real sound, trying to be as if the physical keys made a noise, and the last is purely digital.
Figure [[fg:feedback-loops]] shows some of these feedback loops. Unless the user is implanted with a brain-reading device, all interactions with the machine start with some physical action (a). This could include making sounds, but here we will focus on bodily actions such as turning a knob, pressing a button, dragging a mouse. In many cases this physical action will have an effect on the device: the mouse button goes down, or the knob rotates and this gives rise to the most direct physical feedback loop (A) where you feel the movement (c) or see the effect on the physical device (b).
[[@fg:feedback-loops]] Multiple feedback loops
In order for there to be any digital effect on the underlying logical system the changes effected on the device through the user’s physical actions must be sensed (i). For example, a key press causes an electrical connection detected by the keyboard controller. This may give rise to a very immediate feedback associated with the device; for example, a simulated key click or an indicator light on an on/off switch (ii). In some cases this immediate loop (B) may be indistinguishable from actual physical feedback from the device (e.g. force feedback as in the BMW iDrive discussed in chapter [xref name=”ch:sota” /]); in other cases, such as the on/off indicator light, it is clearly not a physical effect, but still proximity in space and immediacy of effect may make it feel like part of the device.
Where the user is not aware of the difference between the feedback intrinsic to the physical device and simulated feedback, we may regard this aspect of loop (B) as part of ‘the device’ and indistinguishable from (A). However, one has to be careful that this really is both instantaneous and reliable. For example, one of the authors often mistypes on his multi-tap mobile phone, hitting four instead of three taps for letters such as ‘c’ or ‘i’. After some experimentation it became obvious this was because there was a short delay (a fraction of a second) between pressing a key and the simulated keyclick. The delayed aural feedback was clearly more salient than the felt physical feedback and so interfered with the typing; he was effectively counting clicks rather than presses. Switching the phone to silent significantly reduced typing errors! [comment] linked to McGurk Effect? [/comment]
The sensed input (i) will also cause internal effects on the logical system, changing the internal state of logical objects; for a GUI this may be changed text, for an MP3 player a new track or increased volume. This change to the logical state then often causes a virtual effect (iii) on a visual or audible display; for example an LCD showing the track number. When the user perceives these changes (d) we get a semantic feedback loop (C). In direct manipulation systems the aim is to make this loop so rapid that it feels just like a physical action on the virtual objects.
Finally, some systems affect the physical environment in more radical ways than changing screen content. For example, a washing machine starts to fill with water, or a light goes on. In addition there may be unintended physical feedback, for example, a disk starting up. These physical effects (iv) may then be perceived by the user (e) giving additional semantic feedback and so setting up a fourth feedback loop (D).
Frogger – feedforward
The Device Unplugged
When something stops working, and it is safe to do so, you might give it to a child as a plaything. Or maybe you are waiting for a bus with a restless baby and give her your phone to play with (but after turning it off so that she does not accidentally call the police!).
With an iPhone the baby could perhaps use it as a mirror, feel the weight of it, look at its shininess. With a more traditional phone, the baby might press buttons, perhaps open and close the lid (Fig. [[fg:phone-states]]).
[[@fg:phone-states]] (i) a nice mirror (ii) buttons to push and (iii) slide the phone in and out
When we think of a device such as a phone, we quite rightly treat it as a whole: “I press this button and it dials a number”. However, as we started to see at the end of the last section and as the playing baby demonstrates, the physical device has interaction potential even when unplugged, disconnected from its power and digital functionality.
Think of the phone without its battery, or tearing a central heating control off the wall and snipping its wires. What can you do with them? What do they suggest to you?
As the baby would discover, the iPhone on the left in Figure [[fg:phone-states]] has very little interaction potential without its power: there is one button at the bottom of the screen, and a few small buttons on its edge, all artfully placed so as not to obscure the clean lines of the phone. In contrast, the phone on the right offers a variety of actions: pressing buttons and sliding the keyboard in and out.
In the remainder of this chapter we will work through a number of examples of devices showing different kinds of interaction potential when unplugged, and discuss how these physical actions map onto their digital functionality. The ‘unplugged’ behaviour will in most cases be illustrated using a Physigram, a diagrammatic way of formally describing physical behaviour [[DG09,DG17]].
One of the simplest examples of a physical device is a simple on/off light switch. In this case the switch has exactly two states (up and down) and pressing the switch changes the state (Figure [[fg:light-switch-2-states]]). Note that even this simple device has interaction potential, you can do things with it.
[[@fg:light-switch-2-states]] light switch: two states — visible even when the light bulb is broken
Actually even this is not that simple as the kind of press you give the switch depends on whether it is up and you want to press it down or down and you want to press it up. For most switches you will not even be aware of this difference because it is obvious which way to press the switch … it is obvious because the current state of the switch is immediately visible.
Note that the switch has a perceivable up/down state whether or not it is actually connected to a light and whether or not the light works — it has exposed state.
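The two-state switch described above can be written down as a very small state machine. The following is only an illustrative sketch, not a Physigram and not taken from [[DG09,DG17]]; the class and method names are assumptions made for this example. It models the switch entirely on its own, with no bulb or wiring, to emphasise that the state is exposed regardless of any digital or electrical effect.

```python
class LightSwitch:
    """A two-state switch modelled 'unplugged': no bulb, no wiring."""

    def __init__(self, position: str = "up") -> None:
        if position not in ("up", "down"):
            raise ValueError("position must be 'up' or 'down'")
        self.position = position

    def push_down(self) -> bool:
        """Pushing down only has an effect when the switch is up."""
        if self.position == "up":
            self.position = "down"
            return True
        return False

    def push_up(self) -> bool:
        """Pushing up only has an effect when the switch is down."""
        if self.position == "down":
            self.position = "up"
            return True
        return False

    def visible_state(self) -> str:
        """Exposed state: what an observer sees is the state itself."""
        return self.position


switch = LightSwitch()
switch.push_down()
print(switch.visible_state())
```

Note that the two kinds of press are separate actions, each available in only one state; it is the visible position that tells you which one applies.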
The phone in Figure [[fg:phone-states]] also has some exposed state in that you can see whether it is open or closed, but the buttons are not the kind that stay down. The iPhone has no exposed state at all. Here are some more exposed state devices (Figure [[fg:exposed-state]]).
The sockets are similar to the light switch except here the red colour on the top of the switch is also designed to give some indication of the mapping; that is feedforward. The washing machine control dial is more complex, but again it is immediately obvious by looking at the dial that it has many potential states. Like the power switch it also tries to provide feedforward through words and symbols around the dial. We will return to the washing machine dial later as it has a particularly interesting story to tell.
The central heating control is more like the mobile phone as it has a flap that moves up and down. Like the light switch, this means there are two very visually and tangibly obvious states: open and closed. However, this is a very particular form of exposed state as its effect is to hide or reveal other controls. In this case the purpose is to hide complexity, but it may also be used to protect against unintended actions; when the phone is closed it is impossible to accidentally dial a number. In the case of the phone there is of course yet another purpose, which is to change the form-factor: when closed, the phone is smaller and fits in your pocket or handbag.
In contrast to these exposed state devices, consider this volume control on a CD player (Fig. [[fg:CD-hidden-state]]). It has clear action potential, perceptual affordance: you can see that it sticks out, is round, it invites you to pull, push and, in particular given its roundness, twist it. However, remember the power is unplugged and so there is no sound (or imagine twisting it during a moment of silence between movements). There is no indication after you have twisted it of how far. The washing machine and cooker knobs were styled and decorated so there was an obvious “I am pointing this way” direction, but here nothing. In fact, inside things have changed, but on the outside, there is nothing detectable; it has hidden state.
Another common example of hidden state is the bounce-back button, often found as the on/off switch of a computer. Consider the TV and dishwasher buttons in Fig. [[fg:dishwasher]]. Superficially they look similar; however, when you interact with them, their behaviours differ markedly. With the dishwasher button you press it and it stays in (this is in fact the ‘on’ position; see the little red light, the power was actually on when the photo was taken!); that is, it has exposed state. In contrast, you press the TV button in and as soon as you let go the button bounces right back out again. Of course the TV turns on or off as you do this, but the button on its own tells you nothing; that is hidden state.
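The contrast between the two buttons can be sketched in code. This is a hypothetical model for illustration only; the class names are assumptions, and real appliances will differ in detail. The point is where the on/off state lives: in the visible position of the stay-in button, but only inside the device for the bounce-back button.

```python
class StayInButton:
    """Dishwasher-style button: the physical position is the power state."""

    def __init__(self) -> None:
        self.pressed_in = False

    def press(self) -> None:
        self.pressed_in = not self.pressed_in

    def power_on(self) -> bool:
        return self.pressed_in

    def visible_state(self) -> str:
        return "in" if self.pressed_in else "out"


class BounceBackButton:
    """TV-style button: it always springs back out, so the state is hidden."""

    def __init__(self) -> None:
        self._power_on = False  # held inside the device, not in the button

    def press(self) -> None:
        self._power_on = not self._power_on

    def power_on(self) -> bool:
        return self._power_on

    def visible_state(self) -> str:
        return "out"  # identical whether the power is on or off


dishwasher, tv = StayInButton(), BounceBackButton()
dishwasher.press()
tv.press()
print(dishwasher.visible_state(), dishwasher.power_on())
print(tv.visible_state(), tv.power_on())
```

After one press both appliances are on, but only the stay-in button shows it.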
If this seems a minor thing, maybe you have had the experience of a blank TV screen when you don’t know whether it is blank because the TV is off, because it is in standby, because you have turned it on and it is warming up, or because the DVD player connected to it is off. In fact, once you know where to look, it is often possible to tell, because small red LEDs are added; in this case you can see the LED next to the button labelled ‘STAND BY’. However, in reality, do you really look at all those little red lights or do you simply press a few buttons at random on the different boxes until something happens?
Maybe you have even lost data from your computer because you accidentally turned it off when it was in fact just sleeping? On many computers, both desktops and laptops, there is a single on/off button (Fig. [[fg:computer-on-off]]). To turn it on you press it, to turn it off you press it, but it simply sits there looking the same. You open the laptop or look at the monitor and the screen is blank (which may itself be because the computer is asleep or because the monitor is). Thinking it is off, you press the power button, only to hear, too late, the little whirr as the disk starts to spin while the computer wakes from sleep, followed by the dull thud as it turns off and starts to reboot. What was onscreen before it went to sleep? Did you save the draft of that chapter on annoying hidden state buttons?
As you contemplate several hours of lost work, you can take comfort in the fact that the designer has often foreseen the potential problem. In the photo above, you can see that this computer button, like the TV button earlier, has a small light so that you can see that the power is on, in this case a small green (unlabelled) LED. If you had been observant, if you had realised this is an indicator meaning “turned on” rather than one meaning “connected to the power”, if you hadn’t got confused in the moment, then you could have worked out it was on and not lost all that work; small comfort indeed.
Now there are good reasons for using a bounce-back switch, which we will discuss in detail later, one of which is when the computer can also turn itself off in software. However, these bounce-back buttons are often found on computers when this is not the case (indeed the one photographed in Fig. [[fg:computer-on-off]] does not have a software ‘off’) and an old-fashioned up/down power switch might be more appropriate, or one where, like the dishwasher button, it stays depressed when in the ‘on’ position. Even where the software can switch the power off, why not simply have an ‘on’ button and then an additional ‘emergency off’ button for the cases when the software is not shutting down as it should? This could be small and recessed so it is not accidentally pressed, rather like a wristwatch button for setting the time.
Sometimes the reason for not doing this is simply lack of insight, and sometimes plain economics: the cents or pence it costs to add an extra button are worth more to the manufacturer than your lost work! However, often it is aesthetics: your lost work is weighed against the flawless smooth casing with its single iconic button. And if you think the designer made a poor choice, what do you think of when you buy a new computer? It is a brave designer who is willing to focus on the long-term benefit of users that improves their life, rather than the immediate appearance that makes them buy the product. Are you brave enough? Or maybe it is possible to achieve both aesthetics and safety; certainly the additional small ‘emergency off’ button could be located slightly out of sight (although not so hidden the user can’t find it!), or maybe made an essential part of the aesthetic of the device.
Tangible transitions and tension states
When you twist the CD knob shown in Figure [[fg:CD-hidden-state]] it is heavy to turn, giving it a feel of quality; apart from that, however, there is no sense of how far you have turned it. Not all knobs and dials are like this.
Figure [[fg:photo-viewer]] shows three experimental prototypes that were produced for a photo viewer. All have an area where a small screen would go and all have a rotary control. The one on the left has a very obvious retro dial, the middle a more discreet dial, and the one on the right an iPod-like touch surface. In all the prototypes rotating the dial enables the user to scroll between different menu options, although never more than seven at any level.
While they all use rotary controls, they feel very different in use. On the right the touch surface offers no resistance at all; your finger goes round, but without the display you cannot tell there is anything happening. In fact, it is perhaps only because one is used to devices like this that one would even try to stroke it: the cultural affordance of the iPod generation! In contrast, the more clunky looking prototype on the left has a far richer repertoire of tangible feedback. It already has exposed state, as you can tell what direction it is pointing, but in addition it has end stops so that you can feel when it has got to one extreme or other of the menu. Also, the mechanism offers slight resistance as you move between its seven positions; that is, it has tangible transitions between states.
Tangible transitions are particularly important when considering accessibility for the visually impaired or for occasions when you cannot look at the screen, such as when driving. The left hand device has both end-stops and tangible transitions; this means that once someone has learnt some of the menu layouts, the device can be used without looking at the screen at all. Even when you can see, the tangible transitions give additional feedback and the resistance between the positions makes it difficult to accidentally select the wrong option.
The device in the middle has a form of tangible transition, as there is a very slight sensation as one moves between positions, but it has no end stops and there is no resistance before it moves to a new position. The lack of resistance makes errors more likely, and the lack of end stops makes it harder to orient oneself except by looking at the screen; at least, however, it is possible to tell how many steps one has taken.
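The differences between the three rotary controls can be summarised by asking what the hand feels, screen off, as the control is turned. The function below is a rough sketch under stated assumptions: the names `turn`, `end_stops` and `detents` are invented for this example, and the wrap-around behaviour of a control without end stops is an assumption rather than a description of the actual prototypes.

```python
def turn(position, steps, n_positions=7, end_stops=True, detents=True):
    """Turn a rotary control and report what can be felt.

    Returns (new_position, felt), where felt lists tangible events:
    "click" for each felt transition and "stop" on hitting an end stop.
    """
    felt = []
    direction = 1 if steps > 0 else -1
    for _ in range(abs(steps)):
        nxt = position + direction
        if end_stops and not (0 <= nxt < n_positions):
            felt.append("stop")  # the end stop blocks further movement
            break
        position = nxt % n_positions  # assumed wrap-around without end stops
        if detents:
            felt.append("click")
    return position, felt


# Left prototype: end stops and tangible transitions.
print(turn(5, 3))
# Middle prototype: steps can be felt, but there are no end stops.
print(turn(5, 3, end_stops=False))
# Touch surface: nothing at all is felt.
print(turn(5, 3, end_stops=False, detents=False))
```

Only the first configuration gives enough information to know, by touch alone, both how far you have moved and that you have reached the end of the menu.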
It is not only knobs that can have tangible transitions. The light and power switches discussed earlier not only have a visible state, but there is definite resistance as you push the switch down: it gives a little, and then there is a sudden movement as it flicks down. If you release the pressure of your finger before it flicks down, it simply bounces back to where it started. In a sense the device has at least four states: as well as the obvious up and down, there are also part-up and part-down states as one pushes the switch, although only the up and down states are stable when you release your finger pressure.
Bounce-back buttons, such as the computer power button in Figure [[fg:light-switch-part-way]], can similarly be seen as actually having two states, out and in. Only the out state is stable, but while you press with your finger it remains in a pressed-in state. This is a tension state, one where you have to maintain a continuous effort in order to maintain it. In the case of the computer power button, the tension is never maintained; you just press and release. However, tension states are often used as part of interaction, for example, dragging with a mouse.
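Both the four-state light switch and the bounce-back button can be described with the same two tables: where a push takes you, and where the device springs back to when you let go. The sketch below is an informal illustration, not the Physigram notation itself; the names `Device`, `act`, `part_down` and `part_up` are assumptions made for this example. A state is a tension state exactly when releasing changes it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    push: dict     # state -> state reached by pushing (further)
    release: dict  # tension state -> stable state it springs back to

    def is_tension(self, state: str) -> bool:
        return state in self.release


LIGHT_SWITCH = Device(
    push={"up": "part_down", "part_down": "down",
          "down": "part_up", "part_up": "up"},
    release={"part_down": "up", "part_up": "down"},
)

POWER_BUTTON = Device(
    push={"out": "in"},
    release={"in": "out"},
)


def act(device: Device, state: str, actions: list) -> str:
    """Apply a sequence of "push" and "release" actions to a device."""
    for action in actions:
        if action == "push":
            state = device.push.get(state, state)
        elif action == "release":
            state = device.release.get(state, state)
    return state


# A half-hearted press on the light switch bounces back...
print(act(LIGHT_SWITCH, "up", ["push", "release"]))
# ...whereas pushing through the resistance flips it.
print(act(LIGHT_SWITCH, "up", ["push", "push", "release"]))
# The power button has only one stable state.
print(act(POWER_BUTTON, "out", ["push", "release"]))
```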
Keeping your hand, or other muscles, in tension can cause fatigue if maintained for a long period, and also affects accuracy and timing. Indeed, Fitts’ Law measurements show measurable differences in performance between ordinary mouse movement and dragging [[MS91]]. However, the advantage of a tension state is that you are very aware that you are in the middle of doing something. When typing it is possible to break part-way through a sentence and leave it incomplete; however, it is impossible to go away part-way through dragging the mouse, as you have to release the mouse button and end the drag. This can be particularly important in safety critical situations, such as the use of the ‘dead man’s handle’ in trains.
It is summer holidays, and you are driving down a small country lane in Cornwall where the sides of the lanes are high earth banks and the lanes themselves winding, narrow and with no room for cars to pass one another. The car is packed full with suitcases and tents, spades and swimming costumes, so you cannot even see out the back window. Suddenly, round the bend ahead another car appears coming towards you. You both stop, one of you must go back. There is a relatively straight part of the lane behind you with nowhere to pass, but you do recall you passed a gateway just before the last bend, so you shift the gear into reverse and begin to edge backwards, with only your wing mirrors to see where you are going.
At first you drive very slowly, everything is back-to-front and unless you think very hard you turn the wheel in the wrong direction. However, after a bit you find yourself confidently driving backwards at a fair speed down the straight lane behind. Every so often you suddenly ‘lose’ it, end up getting too close to one of the banks and have to stop, and, as if from the beginning, work out which way to turn the wheel, but each time you quickly get back into the flow.
Even if you are a very experienced driver and find reversing is no longer difficult, maybe you have tried to reverse a trailer or caravan and had a similar experience.
It is reasonable that this is difficult if you are not used to reversing a car long distances, but what is remarkable are the periods in between when it becomes easy. It is not that you have learnt the right thing to do, as you find that when you get out of the flow, you have no better idea which way to turn the wheel than when you started.
The reason for these periods when it becomes easy is that the steering wheel exploits very basic human responses: the natural inverse. When you draw a line on paper and decide it is in the wrong place, you need to find the eraser and rub it out. It is not hard, something you will learn to do without thinking, but it is something you have had to learn. If, however, you are trying to put the pencil inside a desk-tidy and move your hand a little too far to the right, you automatically move it slightly to the left. In the world there are natural opposites: up/down, closer/further, left/right, and our body mirrors these with muscles and limbs hard-wired to exploit them.
Rodolfo Llinás’ work showed that some of this is very low-level indeed, with sets of mutually inhibitory neurons that allow pairs of opposing muscles to be connected [[Li02, p.45]]. Although higher-level brain functions determine which pairs operate, some of the actual control even happens in the spinal column, as is evident in the headless chicken that still runs around the farmyard. These paired muscle groups allow rhythmic movement such as walking, and also the isometric balancing of one muscle group against another that is needed to maintain a static position, such as holding a mug in mid air. [comment]maybe drawing of neurons from [[Li02]][/comment]
When you first start to drive in reverse, you have to think to yourself, “if I want the car to move to the left which direction do I need to turn the wheel”. However, when you have started to move the wheel and discover it is going in the wrong direction, or you are about to overshoot and go too far, you do not have to think again, but instinctively move it in the opposite direction. The natural inverse takes over and you do the opposite of what you were doing before.
Unplugged devices often have buttons, knobs and other controls that have natural inverse actions: twist left/twist right, push/pull. The minidisk controller in Figure [[fg:minidisk]] was intended to be clipped on to clothing while the user was walking or running. Given it will be used eyes-free it is particularly important that the physical format helps make it easy to use. The device has two different kinds of control and both of them exhibit natural inverse.
On the end of the device is a knob. Twisting the knob in one direction moves on to the next track, twisting it in the other direction moves it back a track: a natural inverse. The knob can also be pulled out and this changes its function: twisting it one way increases the volume, twisting it in the opposite direction turns the volume down. Note also that this is a tension state. It is obvious whether you are changing track or changing volume, not just from the immediate aural feedback, but also because the knob wants to spring back into place; adjusting the volume requires continuous tension.
Along the side of the device are a number of spring-back sliders; they can be pushed forward or backward along the device. Each slider controls a different function, but all of them use the same principle. There is an ordered list of options for each setting; pushing the slider moves between the relevant option settings one way through the list, pulling it back moves it the other way. Note that the natural inverse reduces the impact of mistakes. If you choose the wrong option and change it, you instinctively move the slider in the opposite direction and restore the setting.
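The way a spring-back slider steps through an ordered list can be sketched as follows. This is a hypothetical model: the option names are invented, and the text does not say what the real device does at the ends of a list, so the sketch assumes the list wraps around, which guarantees that a push followed by a pull always restores the previous setting.

```python
class SpringBackSlider:
    """A slider that steps through an ordered list of options.

    The slider itself always springs back to centre, so the only state
    is the currently selected option.
    """

    def __init__(self, options):
        self.options = list(options)
        self.index = 0

    def push_forward(self) -> str:
        self.index = (self.index + 1) % len(self.options)
        return self.current()

    def pull_back(self) -> str:
        self.index = (self.index - 1) % len(self.options)
        return self.current()

    def current(self) -> str:
        return self.options[self.index]


# Hypothetical play-mode setting, for illustration only.
mode = SpringBackSlider(["normal", "shuffle", "repeat"])
before = mode.current()
mode.push_forward()   # oops, not what was wanted
mode.pull_back()      # the natural inverse undoes it
print(before == mode.current())
```

Because the two movements are exact inverses, no separate ‘undo’ is needed, and the user does not even need to know what the options are in order to recover from a slip.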
Using the natural inverse can obviate the need for explicit ‘undo’ operations, and can make a control usable even when you don’t know what it does. The phone in Figure [[fg:old-phone]] belonged to one of the authors for some years. On the top left-hand side of the phone is a small slider. This slider did different things in different modes: when in a call it adjusted volume, when in the address book it scrolled through the names. The author never knew exactly what it did, yet used it extensively. This was because it always respected the natural inverse property and so he could use it without fear; if it didn’t do what he expected he just did the opposite movement and carried on.
Why driving backwards is hard: expectation, magnification and control theory
Actually, when you are driving using mirrors things are in a sense not back-to-front. If as you look in the mirror you see the left hand side a little too close to the bank, then you turn the wheel to the right — this is exactly the same as when you are driving forwards. So, if you could somehow ‘switch’ off the knowledge you are driving backwards and pretend that the mirror is really just a very small windscreen things would be easier — however, we do not have the power to fool our own minds that easily!
The other difficult thing is that the wing mirrors are designed so that you can see the whole road behind all in a tiny mirror, whereas the same portion of the road ahead fills the entire windscreen. Using the mirrors is a bit like driving forward looking through the wrong end of a telescope.
Finally the mechanics of the steering work differently in reverse. Whether driving forward or backward, it is always the front wheels that turn. This makes it (in principle) easier to reverse into a narrow space, but makes the car much harder to drive in a straight line. Also the front wheels of a car have a slight ‘toe-in’, they point slightly together. This has the effect of making the car want to stay in a straight line going forwards, but has the opposite effect when driving backwards.
Chapter [[ch:body-and-mind]] discussed open and closed loop control; these are part of a wider area of mathematics called “Control Theory”. One general principle in control theory is that there is always a trade-off between control and stability. For example, a light beach ball is easy to control, you can roll it exactly where you want it, but it is unstable: the slightest breeze and it rolls away. In contrast, a large cubic block of concrete is very stable, but boy is it hard to move where you want it. The forward and reverse movements of the car demonstrate different points in this trade-off: going forwards one has a high degree of stability, it keeps on going in a relatively straight line unless you work hard to change direction, whereas in reverse the opposite is true, it is easy to control in the sense that you can manoeuvre into very tight spaces, but it is highly unstable.
Because digital and mechanical systems do not exhibit proportional effort (Chapter [[ch:physicality-of-things]]) it is possible to engineer situations that are at extreme points in this trade-off space. It is also occasionally possible to ‘break’ the trade-off, to have your cake and eat it. The Eurofighter is deliberately designed to be unstable while flying, rather like the car driven in reverse. This allows very rapid movements when required, but makes it unflyable by a human pilot. However, the pilot’s control is augmented by very fast, automated systems that constantly trim the aerofoils to keep the plane flying where it is intended to go. [comment]Might this section be better in either the ‘Interacting with Physical Objects’ or ‘Comprehension of Space’ chapters?[/comment]
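The stability side of this trade-off, and the way automation can compensate for an unstable design, can be illustrated with a toy simulation. This is a deliberately simplified sketch, not a model of any real car or aircraft: the function name, the gains and the correction factor are all assumptions chosen for illustration. A small deviation from the intended path is multiplied by a gain at each step; a gain below one stands for a self-correcting system, a gain above one for a system that amplifies errors.

```python
def simulate(gain, steps=10, start=1.0, correction=0.0):
    """Follow a deviation from the intended path over a number of steps.

    Each step the deviation is multiplied by `gain`; an automatic
    controller then removes `correction` times the current deviation.
    """
    deviation = start
    history = [deviation]
    for _ in range(steps):
        deviation = gain * deviation - correction * deviation
        history.append(deviation)
    return history


forward = simulate(gain=0.8)                     # self-correcting
reverse = simulate(gain=1.3)                     # errors grow
augmented = simulate(gain=1.3, correction=0.8)   # unstable, but trimmed

for name, run in [("forward", forward), ("reverse", reverse),
                  ("augmented", augmented)]:
    print(f"{name:10s} final deviation: {run[-1]:.4f}")
```

With these illustrative numbers the self-correcting system shrinks the initial deviation, the error-amplifying system grows it more than tenfold in ten steps, and the same unstable system with automatic correction ends up closer to its intended path than the stable one.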
images – Eurofighter, sketches of car wheels and steering
[[BS05]] S. Benford, H. Schnädelbach, B. Koleva, R. Anastasi, S. Greenhalgh, T. Rodden, J. Green, A. Ghali, T. Pridmore, B. Gaver, A. Boucher, B. Walker, S. Pennington, A. Schmidt, H. Gellersen & A. Steed. Expected, sensed, and desired: A framework for designing sensing-based interaction. ACM Transactions on Computer-Human Interaction (TOCHI), Volume 12, Issue 1, March 2005.
[[Bu90]] Buxton, W. (1990). A Three-State Model of Graphical Input. In D. Diaper et al. (Eds), Human-Computer Interaction – INTERACT ’90. Amsterdam: Elsevier Science Publishers B.V. (North-Holland), 449-456. http://www.billbuxton.com/3state.html
[[DG09]] A. Dix, M. Ghazali, S. Gill, J. Hare and D. Ramduny-Ellis (2009). Physigrams: Modelling Devices for Natural Interaction. Formal Aspects of Computing , Springer, 21(6):613-641, doi:10.1007/s00165-008-0099-y http://alandix.com/academic/papers/FAC-physical-2009/
[[DG17]] Alan Dix and Masitah Ghazali (2017). Physigrams: Modelling Physical Device Characteristics Interaction. Chapter 9 in The Handbook of Formal Methods in Human-Computer Interaction, Springer, pp.247–271. DOI: 10.1007/978-3-319-51838-1_9
[[Li02]] Rodolfo Llinás (2002). I of the Vortex: From Neurons to Self. MIT Press.
[[MS91]] MacKenzie, I. S., Sellen, A., & Buxton, W. (1991). A comparison of input devices in elemental pointing and dragging tasks. Proceedings of the CHI ’91 Conference on Human Factors in Computing Systems, pp. 161-166. New York: ACM.
[[Ne68]] Newman, W. M. 1968. A system for interactive graphical programming. In Proceedings of the April 30–May 2, 1968, Spring Joint Computer Conference (Atlantic City, New Jersey, April 30 – May 02, 1968). AFIPS ’68 (Spring). ACM, New York, NY, 47-54. DOI= http://doi.acm.org/10.1145/1468075.1468083
[[No98]] Donald A. Norman, The Design of Everyday Things. MIT Press, 1998, ISBN-10: 0-262-64037-6
[[Re90]] Reason, James (1990): Human Error. New York, NY, Cambridge University Press
[[Wa76]] Wallace, V. L. 1976. The semantics of graphic input devices. SIGGRAPH Comput. Graph. 10, 1 (May. 1976), 61-65. DOI= http://doi.acm.org/10.1145/957197.804734
[[WM09]] Jacob O. Wobbrock, Meredith Ringel Morris, and Andrew D. Wilson. 2009. User-defined gestures for surface computing. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’09). ACM, New York, NY, USA, 1083-1092. DOI: https://doi.org/10.1145/1518701.1518866