On Human-Centered Multimedia
May 25, 2005
Alejandro Jaimes FXPAL Japan, Fuji Xerox Co., Ltd.
While thinking about the topics that I wanted to cover in this article, my daily encounters with multimedia in the “real” world came to my mind. Using some of these as examples, I will briefly discuss some thoughts on what I consider three key factors in the development of future multimedia systems: (1) the role of culture; (2) integration of sensors; (3) access outside the desktop by a wide range of users.
I will argue that developing multimedia systems requires a human-centered approach. By human-centered I mean an approach in which the user is the starting point and in which social and cultural factors are quantified at multiple levels and incorporated into computational frameworks.
Culture, Deployment, and Access
I live in Japan. When one enters any establishment in Japan, there is an immediate “irashaimase” greeting (welcome!) by store employees. But it is also often automatic: when I enter an elevator or approach an ATM or a metro ticket vending machine, a sensor activates, and I am greeted by multimedia cartoon characters that speak to me (welcome! going down! all information will be displayed in English!). The characters do not speak like computers; they speak like Japanese sales clerks (high-pitched voices of very specific characteristics). They even bow. The ATM welcomes me before I touch it, the elevator greets me as I enter, and the toilet seat goes up when I open the door of the restroom in a restaurant (another welcome sign).
It is interesting to consider these systems while thinking about multimedia. Although the interfaces are very primitive and some of them are not really multimedia computing systems, there are several important characteristics to consider: (1) they act according to the cultural context in which they are deployed; (2) they integrate different types of sensors for input and communicate through a combination of media; (3) they are deployed outside the desktop and they are meant to be accessed by a diversity of individuals.
Let me expand on these points.
A Human-Centered Approach
Human-centered multimedia systems should be multimodal (inputs and outputs in more than one modality or communication channel). They must also be proactive (understand cultural and social contexts and respond accordingly), and be easily accessible outside the desktop to a wide range of users.
A human-centered approach to multimedia starts from user models that consider how humans understand and interpret multimedia signals (feature, cognitive, and affective levels), and how humans interact naturally (cultural and social context as well as personal factors such as emotion, mood, attitude, and attention).
Inevitably, this means considering some of the work in fields such as psychology, communications research, HCI, and others, and incorporating what is known in those fields in mathematical models that can be used to construct algorithms and computational frameworks that integrate different media. Machine learning integrated with domain knowledge, automatic analysis of social networks, data mining, sensor fusion research, and multi-modal interaction will play a special role. More research into quantifying human-related knowledge is necessary, which means developing new theories (and mathematical models) of multimedia integration at multiple levels.
The Future is Bright
Multimedia computing offers, for the first time in history, real possibilities of human-like interaction with machines. This is very significant because technology has traditionally played a crucial role in development and multimedia can make the difference in the democratization of technology (access to all). That is crucial because computational technology is becoming the gateway to all basic human resources.
We are still far from achieving human-like interactions with machines and most of the world’s population does not have access to technology. A human-centered approach, however, contributes to making interaction more natural and will ultimately make technology more accessible to everyone.
Many technical challenges lie ahead and in some areas progress has been slow. With the cost of hardware continuing to drop and the increase in computational power, however, there have been many recent efforts to use multimedia technology in entirely new ways. One particular area of interest is “new media art.” Many universities around the world are creating new joint art-computer-science programs in which technical researchers/artists create artworks that combine new technical approaches or novel uses of existing technology with artistic concepts. What is interesting about some of these works is that technical novelty is introduced while many of the issues described above are considered: cultural and social context, integration of sensors, migration outside the desktop, and access.
Technical researchers need not venture into the arts to develop human-centered multimedia systems. In fact, in recent years many user-centered multimedia applications have been developed (e.g., smart homes and offices). However, more effort is needed, along with the realization that multimedia research, except in very specific applications, is meaningless if the user is not the starting point.
Figures
The toilet seat goes up automatically as the customer opens the door to the restroom. It closes when the customer closes the door. Many of these toilets have a water jet spray used to wash and massage the buttocks; they also warm the seat and play a range of melodies while in use: chirping birds, rushing water, tinkling wind chimes, or traditional Japanese harp, among others. An article I read claimed that more than half of Japanese homes have such electric toilets, a rate higher than that of personal computers.
It is customary to welcome customers with a loud “irashaimase” (welcome). This can be tiring for the store clerks, so in some places sensors have been installed so that customers are automatically welcomed by a recording as they enter (and thanked as they leave).
Some restaurants have wireless touch screens so customers can order. The screens can be passed around the table just like a menu. The waiters only show up when the food is ready or when a customer calls them by pressing the waiter icon.
Maybe nowhere in the world more than in Shibuya, a crowded, young area of Tokyo, are pedestrians bombarded with videos, images, and sounds. Everyone has a cell phone.
The ATM activates when it is approached. Unlike with Western ATMs, the customer first chooses the desired option and then inserts the card.
Almost all elevators in Japan speak to indicate if they’re going up or down and which floor they’re on.
Car GPS navigation systems have cartoon characters that bow and speak to the driver.
Japanese photo sticker booths are very popular. Users receive instructions from a cartoon character that speaks and can customize and modify their photos prior to printing.
The screen turns on and the cartoon character bows as the machine is approached.
Train:
Monitors on trains show the map and are also used for advertisement (no sound though).
All photos, text, and videos © 2005 Alejandro Jaimes. All rights reserved.
