Tangibility of the Digital - Die Fühlbarkeit des Digitalen
Johann Habakuk Israel
Physical Interfaces and Tangible Interaction: Instruments for Self-Communication
For a long time, instruction in the operation of machines has been based on spoken language. Because we were all trained to interpret words as similarly as possible, those who had not yet worked with machines of a given kind could also start to do something with them: they simply had to learn to use certain words as commands.
Johann Habakuk Israel sees in sensorimotor computer controls processes that can be superior to linguistic controls. Whereas communication in spoken language is based on an exchange of information, interaction with objects is, according to Israel, based on an exchange of energy. Communication requires interpretation, which includes misinterpretation and thus triggers a lot of "traffic"; object-centred interaction instead draws on intuitive behavioural schemas. These have an archetypal character, can therefore be less sensitive to misinterpretation, and can relieve the attentive consciousness of a surfeit of "traffic".
Johann Habakuk Israel is a computer expert. During the nineties he worked on 3D interaction using visual control at the Heinrich-Hertz-Institut and subsequently on digital library systems and e-learning platforms. Since 2005 he has been a doctoral candidate in a postgraduate programme at the Man-Machine Systems Centre of the TU Berlin and in the Department of Virtual Product Development at the Fraunhofer IPK (Fraunhofer Institute for Production Systems and Design Technology). His research is focussed on tangible interaction in virtual environments, analogue-digital hybrid tools and sketching in immersive three-dimensional environments.
Working with physical tools is fundamentally different from communicating with human beings. In the first case energy is transmitted, whereas in the second case information is transferred, for example through spoken language, gestures or body signals. This sharp distinction cannot, however, be applied to interaction with computers and technical systems. What we find here is an unsystematic mixture of both interaction modes. In the following I try to show that it is useful to distinguish systematically between the two modes in the design of user interfaces.
What is specifically new in the concept of tangible interfaces is the link between physical access, hybrid physical/virtual representations and digital data space. The digital space is becoming physically, mentally and literally graspable. As instruments in creative processes of thinking and self-communication, tangible interfaces can make us aware of new action spaces and opportunities.
In the "communication" with computers, we show again and again via mouse clicks what we want to do, talk by means of commands and, more recently, gesture in front of computer cameras in order to invoke functions. Almost every personal computer has a name; some people caress their computers or beat them as punishment for malfunctions or system crashes. These are all interaction patterns borrowed from human-to-human communication. Consequently it is not surprising that computers are regularly described as individuals which do something, think, and talk. Many studies have shown that computers play a social role and thus are seen more as social actors than as neutral tools.1 From their first year on, children learn to distinguish between the psychical, active, moving world on the one hand, and the physical, passive, externally moved world on the other hand.2 In human-machine-interaction, this very basic skill seems to be lost. From a critical point of view one could say that the character of human-machine-communication is animistic and pushes the user into childlike patterns of behavior and thinking.3
Even if Tangible User Interfaces (TUI) are "intelligent objects", interacting with them is - at least in the initial concept of Fitzmaurice, Ishii & Buxton4 and Ishii & Ullmer5 - closer to using physical tools than to human-to-human communication. What is used here is not verbal commands or pointing gestures but physical manipulation, which requires fewer cognitive resources and leaves the machine in the role of an inanimate physical object.
Various models of human information processing place linguistic communication with living objects at higher cognitive levels than sensorimotor actions without intended information exchange6. Distinguishing between living and inanimate objects helps people to reduce complexity. Paul Watzlawick7 explains this with the following example: "When you are walking and kick a stone, energy will be transferred from the foot to the stone, the stone will start rolling and will finally stop at a place that is completely predetermined by the amount of energy transferred, the shape and weight of the stone, and its surface characteristics.
Now suppose the object being kicked is a dog: the dog could jump and bite you. In this case the relation between the kick and the bite would be fundamentally different because the dog would doubtlessly use its own energy resources and not the energy transferred by the foot kick. What is being transferred here is not energy but information. In other words the kick would be a kind of behavior that communicates something to the dog and the dog responds with an appropriate but different kind of behavior."8 According to Piaget, too, "the common sense of the adult (...) makes use of the principle of inertia in order to distinguish between life and inorganic matter (...) A physical body possesses only the motion it has received; a living being creates (for common sense) motion."9 What both point out is that humans divide objects into intelligent (non-deterministically responding) and deterministic objects in order to allocate attention resources. Intelligent objects require more attention and awareness than primitive non-intelligent objects. The interaction with the two types of objects differs fundamentally. In the case of physically manipulating passive, deterministic objects, the energy which is transferred during the interaction is not interpreted or understood by the object. In theory, the responses to manipulations can be predicted entirely from physical laws, which makes prediction comparatively easy. In the case of communicating with intelligent objects, e.g. by means of verbal language, body language (gestures) or by means of third objects, predicting the response is far more difficult. Because their inner state and the basis of their "decisions" are never entirely known, the range of possible responses increases dramatically and the response can even become unpredictable.
Because the sensory equipment of the human is too slow for controlling fast object manipulations in the physical environment, humans simulate the behavior of deterministic objects internally. Rasmussen10 describes this in the concept of the internal dynamic world model. The simulation allows fast and efficient skill-based manipulation, which requires only few attentional resources and need not be controlled consciously. The simulation runs continually and becomes conscious only if a contradiction between inner simulation and outer perception occurs (interrupt). In this case humans need to apply rules and knowledge in order to resolve the contradiction. In contrast to simulating deterministic objects, for which it is comparatively easy to find invariants11, simulating the interaction with non-deterministic objects in an internal dynamic world model would in many cases lead to an inefficient strategy of trial and error. Instead, such interaction requires rule-based behaviour, which is typically consciously controlled and thus requires more cognitive resources than skill-based behaviour.
From the comparison of complex communication and physical interaction one can derive for human-machine-interaction that physical elements and physical interaction metaphors can make a user interface much easier to use in terms of cognitive load. The more (sub)tasks can be performed entirely physically, the lower the cognitive load12. On the other hand, the more tasks require the generation of (linguistic) symbols in an explicit syntax, the higher the cognitive load. If the control of the system, the interaction problem13, can be solved by means of physical manipulation, more cognitive resources are left for solving the main task, the overall problem14.
Physical interaction as the lowest level in semiotic human-machine-interaction models
Considering semiotic human-machine-interaction models helps to separate physical from verbal/gestural interaction in user interfaces, and to classify tangible interfaces in the context of other kinds of interfaces. Semiotic interaction models evolved in the early eighties.15 They separate the aims and tasks of the user and abstract functions from the concrete coding of the interaction in real systems. In 1986, Nielsen16 developed a model which comprises seven levels. The (7) goal level at the top of the model describes the goal of the user, from which the (6) actual task is derived (task level). The (5) semantic level defines the functionality of the system, that is, sequences of user actions and system responses. The (4) syntax level defines interaction tokens (words) and how to use them to create semantics. The (3) lexical level describes the structure of these tokens (words), which are made up of elements from the (2) alphabetic level. The actual exchange of these tokens occurs at the (1) physical level by means of user actions and I/O elements, e.g. displays and input devices.
The interaction-problem17 which is to be solved by the user by means of a concrete technical system is defined by the four lower levels. During man-machine-interaction, energy or information is exchanged on each layer, but with varying valences and bandwidths.
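The layering described above can be written down as a small data structure. The following is only a minimal sketch: the names `Level` and `interaction_problem_levels` are hypothetical and chosen for illustration, and the only facts encoded are the seven level names, their numbering, and the statement that the interaction problem is defined by the four lower levels.

```python
from enum import IntEnum


class Level(IntEnum):
    """The seven levels of the semiotic interaction model, numbered as in the text."""
    PHYSICAL = 1    # user actions and I/O elements (displays, input devices)
    ALPHABETIC = 2  # elements from which tokens are built
    LEXICAL = 3     # structure of the interaction tokens (words)
    SYNTAX = 4      # how tokens are combined to create semantics
    SEMANTIC = 5    # system functionality: user actions and system responses
    TASK = 6        # the actual task derived from the goal
    GOAL = 7        # the goal of the user


def interaction_problem_levels() -> list[Level]:
    """Return the four lower levels, which define the interaction problem."""
    return [level for level in Level if level <= Level.SYNTAX]


if __name__ == "__main__":
    for level in interaction_problem_levels():
        print(level.value, level.name.lower())
```

Using an ordered enumeration keeps the notion of "lower" and "higher" levels explicit, so a comparison such as `level <= Level.SYNTAX` expresses the boundary of the interaction problem directly.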
From the perspective of semiotic interaction models it is clear that the same overall problem may be solved in completely different ways with different technical systems. For example, in a command-line interface it might be necessary to press several keys in order to specify and invoke a command, whereas in a graphical environment it could be appropriate to select and drag an icon by means of a mouse.
Compared to command-line and graphical user interfaces, tangible user interfaces show a very high bandwidth of energy exchange ("traffic") on the lowest, physical level. The syntax of interaction in tangible interfaces is usually given implicitly by the physical coding of the interface. The user need not learn an explicit command language and syntax. Action and object, which are usually specified separately, are merged into a single interaction token.18 Solving the interaction-problem requires comparatively few cognitive resources. Because physical manipulation of the interface is frequently repeated and re-encoded, it often happens below the level of consciousness and invokes, according to Naumann et al.19, intuitive interaction.
Thus, a design principle for tangible interfaces could be to generate as much traffic as possible at the lowest, physical level, and only little traffic at the higher levels of the interaction problem. A possible approach is to increase the degrees of freedom at the physical level, to generate complex but few interaction tokens (words), and to code their syntax implicitly and physically instead of explicitly and verbally. A simple example: labelling a door with "pull" or "push" in order to code the direction of operation generates "traffic" at the syntactical level. Designing the door such that the direction of operation is given by its physical form, by contrast, defines the syntax implicitly at the lowest, physical level. The second solution generates less traffic at the higher levels, uses fewer linguistic symbols and thus limits the "linguistic noise" during the interaction.
The "wild mixture" of icons, labels, control panels, and input devices in graphical user interfaces is hard to justify. Elements which generate symbols, define syntax or enable physical interaction are mingled together. A consistent inherent logic is missing in the interface. The only thing that gives meaning to such interfaces is the intelligence and learning ability of the human user.
Successful tangible interfaces achieve consistency and understandability by embedding both the interface and the application in the user's environment. The requirement to adapt to and be compatible with the physical environment gives the interface meaning20 which need not be communicated linguistically. The syntax is the syntax of the physical world and not the syntax of the verbal world.
From the developer's point of view it is much easier to code the affordances of a system verbally than physically. A dialog window which generates lots of words (linguistic noise) is easy to program with only a few lines of code. Coding the same logic physically in an interface requires much more effort from the developer. She must consider how the system is perceived and understood by the user without the help of written language. But if she succeeds, the interface becomes genuinely easier and more efficient to use. One possible tool for achieving this aim is image schemas21, basic perceptual-motor patterns that structure the way humans understand the physical world and also abstract concepts. A basic image schema is, for example, CONTAINER, which is characterized by an inside, an outside and a boundary between them. In the analysis and design phase image schemas may be used as a meta-language which corresponds to human thought in order to design complex, abstract and at the same time understandable tangible interfaces.
Tangible interfaces as instruments of self-communication
The advantages of tangible interfaces presented so far relate mainly to easing the interaction and to the physical coding of the application logic, which can for example be achieved by materializing formerly graphical control elements.
However, the evolutionary step initiated by tangible interfaces is the extension of the human imaginative space by the digital space. The digital space becomes graspable via tangible interfaces: users may think and act with, or through, the physical objects of the interface.22 Following the extended mind theory23, the physical world is an inherent part of cognitive processes and human thought. Due to their immediate, instantaneous and extensive coupling to the real world, tangible interfaces gain a quality which makes them also part of human thought and which expands the awareness of one's own action spaces.
An example of new thinking and action spaces is creative problem solving and sketching by means of tangible interfaces. Sketching is a process of externalizing mental images and reflecting upon them, a process often described as self-communication.24 The medium determines both the process, the sketching, and the product, the sketch. Changing the specific characteristics of the medium also changes the self-communication process. Hence sketches differ depending on whether they were created by means of pencil, chalk or spray. Tangible interfaces afford several new sketching techniques which remodel interactive problem solving and thinking. A typical way is to employ physical tools in order to create various kinds of traces in usually hybrid physical/virtual media. For instance, moving a pen may not only create a stroke within the digital space but also generate algorithmic patterns or moving structures. If the medium is a virtual three-dimensional space, designers may draw lines directly "into the air" and bring sketch, environment and themselves into a spatial relationship.25 With the third dimension, the number of potential solutions increases, and with it the thinking space of the designer.
As critical factors of "successful" tangible interfaces Sharlin et al.26 name "successful" spatial mappings between physical and digital space, a unified input and output space, and support for trial-and-error activity. Another key criterion for "successful" tangible interfaces which are supposed to support thinking is the speed with which user movements are transformed into the (not necessarily visual) digital space. This speed must be high enough for the interface to be used in sync with one's own thinking. System response times above 50 milliseconds have a delaying effect on the coordination of the hand; with response times of 225 milliseconds and above users tend to adjust their movements and effects only asynchronously.27 Compared to entirely graphical interfaces, entirely physical interfaces have the advantage of reacting without delay (lag) and with infinite frame rate. This makes interaction more effective28 and supports the impression of interacting directly in the physical world instead of interacting with a graphical image. Furthermore, the materiality of physical tools allows the use of fine muscle groups in the hand and of manual and technical skills.29 If materiality is missing, for example with entirely graphical interaction objects, even simple rotary movements may become difficult for the user, which may lead to strange body movements.30
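The two response-time thresholds mentioned above can be turned into a simple check, for example when profiling an interface prototype. This is a minimal sketch, not an established measurement procedure: the function name `classify_response_time` and the three category labels are hypothetical, and the only values taken from the text are the 50 ms and 225 ms thresholds.

```python
# Thresholds as stated in the text (milliseconds).
DELAY_THRESHOLD_MS = 50    # above this, hand coordination is delayed
ASYNC_THRESHOLD_MS = 225   # at or above this, users adjust only asynchronously


def classify_response_time(response_ms: float) -> str:
    """Classify a system response time by its likely effect on the user.

    The labels are illustrative names for the three ranges described in the text.
    """
    if response_ms < 0:
        raise ValueError("response time must not be negative")
    if response_ms >= ASYNC_THRESHOLD_MS:
        return "asynchronous"
    if response_ms > DELAY_THRESHOLD_MS:
        return "delayed"
    return "synchronous"


if __name__ == "__main__":
    for sample_ms in (10, 50, 120, 225, 400):
        print(sample_ms, "ms ->", classify_response_time(sample_ms))
```

The boundary handling follows the wording of the text: "above 50 milliseconds" is treated as strictly greater than 50, and "225 milliseconds and above" as greater than or equal to 225.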
Employing gestures by means of tools, hands or the whole human body is regularly discussed in the context of tangible interaction.31 Here, manipulative gestures have to be considered fundamentally different from communicative, semaphoric gestures.32 Whereas the first are body movements accompanied by physical feedback from the manipulated object, the second are, whether intended by the sender or not, interpreted by the receiver; thus they are inherently social. Communicative gestures for the control of systems are, due to their (body-)linguistic character, more closely related to command-based interfaces than to tangible interfaces. An example is the "put-that-there" interaction technique, which allows graphical objects to be moved on the screen by means of deictic hand gestures and spoken commands.33 If gestures are employed as a means of self-communication, that is, if body movements, as in computer-based sketching, are not interpreted but transferred into the digital medium via physical metaphors, it is no longer necessary to call these systems gesture-based. Pen and paper, hammer and nail, or scraper and clay are not gesture-based tools either. Instead, they enable the illustration and alienation of motion traces. Also, current gesture-recognition systems require overly explicit movements from the user in order to achieve stable results. Hence, besides transferring social interaction patterns into human-machine-interaction, they blunt and standardize the user's body language. Tangible interfaces should therefore not employ gestural control but capture body movements and display them in the digital space by means of physical metaphors.
Good tangible interfaces allow us, in contrast to today's scratching on the surfaces of user interfaces, and in the very meaning of the initial term "graspable user interfaces"34, to reach into the digital space. This is going to alter the way we think and has already done so. Nevertheless it remains important to domesticate the computer as a tool. Therefore: less gesticulation, less language, less communication with the interface, but more direct physical interaction, movement and self-communication, and room for direct human-to-human communication.