February 27, 2006
Hearing waves and bows
Gesturing in MMORPGs is an unusual experience. On the one hand, avatar gestures can appear impressively realistic (e.g., EQII or SWG), and in some cases may even be created through motion-capture of real human bodies. So when I type "/wave" my avatar performs a pretty realistic-looking hand wave. On the other hand, whenever my avatar performs such a gesture, the system also generates a public text emote describing the gesture (exceptions are SL and There). So typing "/wave" also generates a message like, "Bob waves to you." In other words, gesturing in most MMORPGs is multi-modal with both visual and textual components.
So essentially these systems simulate a world in which when people gesture, they also simultaneously announce what they are doing. When I shrug at you, I also say, "Bob shrugs at you." Imagine if we did this in real life. You would not only see people gesture, you would hear them gesture (since reading text is the correlate of hearing voice in most games). This totally changes the organization of gesture in interaction.
You would not necessarily need to coordinate a gesture so that the recipient could see it. As long as you knew he or she could hear it, that could be enough. In fact, in such cases, you wouldn't even need to perform the visual part of the gesture at all. I approach Nic from behind and say, "Bob waves to Nic" without bothering actually to wave. Nic doesn't even turn around but simply returns, "Nic waves to Bob." (There's a cheesy commercial or SNL skit in here somewhere, I'm sure.)
This is basically my experience of gesturing in virtual game worlds. Despite the impressive sophistication of the gesture animations, players tend to rely more on the text emotes. The result is a player experience that is more like a text-based MUD than a three-dimensional, avatar-based world.
For example, take the following interaction from Star Wars Galaxies. In this encounter, Atac is in the player city in which she lives on Naboo, and she heads toward the player association hall to see if anyone is around.
[Star Wars Galaxies: Unseen Waves: Atac's perspective]

01 00:00 ((Atac rounds the corner of the PA hall))
02 00:00* Sin'thea waves to Teli Tubbi.
03 01:04* Sin'thea nods at Kimon Calari.
As Atac rounds the corner of the PA hall (line 01), she sees two text emotes, "Sin'thea waves to Teli Tubbi" and "Sin'thea nods at Kimon Calari." The text emotes alert her to the fact that her friend, Sin'thea, is nearby even though she cannot see the actual avatar animations with which they correspond. In fact, she cannot even see any other avatars from her current vantage point.

04 05:21 ((Atac mouses over Sin'thea's avatar))
05 07:02 ((Atac selects Sin'thea's avatar))
06 07:14 Sin'thea: is that Teli I see?
07 10:12* You wave to her.
As she approaches the entrance to the PA hall, Atac catches sight of Sin'thea and mouses over her avatar to double check the name (line 04). Although she is still some distance away and Sin'thea is facing away, Atac nonetheless waves to her friend (line 07). Although Sin'thea most likely cannot see Atac's avatar and although it does not even actually wave (because the waving animation is overridden by the walking animation), she can still see the text emote, "Atac waves to you" (line 07).

08 16:20 Kimon Calari nods at Sin'thea.
09 18:00 ((Atac approaches Sin'thea from behind))
10 19:30* Sin'thea waves to you.
11 ((Sin'thea's avatar automatically spins around
12 and waves))
13 21:06 Goldtooth greets you.
14 22:24 Sin'thea: ATAC!!!
15 22:34 Atac: hiya
16 ((Atac's avatar waves automatically))
17 23:22 Sin'thea gives you a hug.
18 ((Sin'thea's avatar automatically gives half a hug))
Atac stops on the steps behind Sin'thea just in time to receive a return wave (line 10) and the two friends continue their greetings (lines 14-18). So in this case, the two players rely almost entirely on the text emotes that accompany gestures in achieving mutual orientation rather than on their avatars. In such cases, which are not uncommon, the text emotes render the avatars more or less irrelevant.
In a similar vein, gesture text emotes can also destroy some of the subtlety and indirection in interaction by making actions overly explicit. For example, in EQII, your avatar will do a series of sexy modeling-like poses if you type "/flirt." But it also announces in text that "You flirt shamelessly with X," thus categorizing your action explicitly as "flirting." So much for the subtle dance of seduction! Similarly in SWG, if you type "/wink" (more a facial expression than a gesture really), your avatar winks, but the system also produces the message: "You wink suggestively at X." I found this very problematic because often when I tried to wink at someone to indicate that I was joking, it came off instead as a come-on due to the verbiage of the text emote. In real life, embodied gestures afford much more strategic ambiguity.
So should developers dispense with the text emotes that automatically accompany gesture animations?
YES... but perhaps not until they fix other problems with gesturing in virtual game worlds. Text emotes currently serve as a kind of Band-Aid for gesture systems that are broken. Without text emotes, players are much more likely to miss gestures directed to them. In real face-to-face interaction, performing a gesture successfully requires making sure the intended recipient can see it. This involves seeing where the recipient is looking and holding (or prolonging the duration of) the gesture until the recipient has displayed some sign of recognition and understanding. This is not possible in most current avatar systems. They don't indicate to you when the other player has detached his view from his avatar's orientation (i.e., by panning and zooming); they don't indicate when the other player's view is obscured by menus (e.g., maps; except in There); and they don't allow you to control the duration of your gestures (see #1, #6 & #8 of my 10 things about avatar interaction). So until such features are implemented, we may have to make do with systems telling us about gestures rather than better enabling us simply to see them.
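As a rough illustration of the features listed above, here is a minimal Python sketch of a gesture that is held until the recipient can actually see it, with the text emote used only as a fallback. Everything in it is hypothetical: `ViewState`, `hold_gesture`, `wave`, and the tick-based timing are names and assumptions made up for this example, not the API of SWG, EQII, There, or any other real client.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class ViewState:
    """Hypothetical per-tick report of what the recipient's screen shows."""
    gesturer_on_screen: bool  # based on the real camera, not the avatar's facing
    menu_open: bool = False   # e.g., a map covering the 3D view

    def can_see_gesture(self) -> bool:
        return self.gesturer_on_screen and not self.menu_open


def hold_gesture(states: Iterable[ViewState], max_ticks: int = 10) -> Tuple[bool, int]:
    """Hold the animation until the recipient can see it or max_ticks pass.

    Returns (seen, ticks_held).
    """
    ticks_held = 0
    for state in states:
        if ticks_held >= max_ticks:
            break
        ticks_held += 1
        if state.can_see_gesture():
            return True, ticks_held
    return False, ticks_held


def wave(actor: str, target: str, states: Iterable[ViewState]) -> Optional[str]:
    """Return a text emote only as a fallback when the wave went unseen."""
    seen, _ = hold_gesture(states)
    return None if seen else f"{actor} waves to {target}."
```

In this sketch, a recipient who first looks elsewhere, then opens a map, then finally looks at the gesturer would see the wave on the third tick and no text emote would be produced. A recipient who never looks would get the familiar "Bob waves to Nic." line instead.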
Posted by Bob
Posted at February 27, 2006 03:21 PM
Comments
Interesting, but, as you suggested, other problems must be remedied before we eliminate the text record. The medium already hinders communication and forces operation in a form of tunnel-vision. In crowded areas, where the LOD manager has kicked down detail to the point of ridiculous, a text log that records emotes can be the only way to know any activity has taken place.
While I find the descriptions annoying and sometimes over the top, I'm becoming more concerned that the animations themselves are becoming... well... more "animated."
In SWG, my stoic, soldierly type would often /nod to others that greet him. It was a simple look in another person's direction and a subtle movement of the head. In EQ2, the nod is repeated, pronounced, and (IIRC) accompanied with a thumbs-up gesture. It looks like something The Fonz from Happy Days would use. In SWG, the flirt was somewhat subtle; in EQ2, it's anything BUT subtle, with, at one point, the character's butt literally waving at the target's animation.
So, we have limited libraries of gestures, oftentimes the gesture associated with a name varies widely between games, and even if the gesture exists, it is animated in a way that's not appropriate for a character's personality. Many times, we resort to text-only emotes to fill in the gaps, and if there are going to be text-only emotes, I'd rather have all emote actions recorded in the text log for continuity's sake.
Unfortunately, when it's not worth development time to resolve issues with basic object interaction like sitting on a chair (EQ2), I don't have much hope for animation libraries robust enough for good nonverbal communication, but maybe I'm just growing cynical.
Posted by: Chas at March 1, 2006 06:45 AM
Chas - excellent points! EQ2 is definitely an interesting beast in terms of avatar animations. On the one hand, both player characters and NPCs do brilliant automated glances at each other whenever they come close, which really make the character models come alive! On the other, the facial expressions have been entirely translated in odd ways into gross body movements (no doubt because facial animations are often impossible to see at a distance).
Even many of the basic body gestures are not only exaggerated, as you point out, they are just plain wrong. The animation for /nod, which I use all the time, is not actually a nod at all. It's a funny chin thrust plus a finger point. If it were not for the text emote that says "Bob nods," I wouldn't recognize it as a nod at all. Similarly the /wave animation is more of a tip of the hat (without the hat) and the /shrug involves a confusing side-step. Yet others like /point, /bow, and /no look right.
Given the wide library of avatar animations in EQ2, SWG and even WoW to a certain extent, devs are obviously spending significant development time on them. Unfortunately they seem to care more about creating humorous animations (e.g., /flex, /violin, /shimmy, /heelclick) than getting the basic ones right.
I think the best solution is probably a licensed avatar-interaction engine, like the Quake physics engines. That way one company (hopefully with the assistance of conversation analysts) can take the time to get the details of avatar interaction right, while the licensees can focus on creating dungeons and what not.
Posted by: Bob at March 2, 2006 05:25 PM
An "emote engine" would be a great boon for players, in that you could go from EQII to SWG to WoW and know that a /nod is a /nod is a /nod.
However, I'm not sure I'd want to trade some of the quirkier anims (the total freak out /dance in EQ for instance) for complete homogeneity, much as I bang on about consistency across games whenever I get the chance. ;)
Perhaps classifying emotes into 'essential', 'fun', and 'unique', where 'unique' are the ones that belong to that specific game, would allow a degree of conformity but at the same time allow the devs (or perhaps modders using Poser..?) to create new emotes/dances which can be added to the library IF the game designers allow that level of customisation.
Posted by: almagill at March 16, 2006 03:24 PM
I do this all the time with certain words. Maybe because chat is as natural to me as speaking.
I often say, "Shrug" on the phone and face to face.
Posted by: Mike Ellis at June 9, 2006 09:50 AM
