
February 08, 2006

Why players adopt 3rd party VoIP apps

One reason, although certainly not the only one, that players of MMORPGs use 3rd party VoIP applications, such as TeamSpeak or Ventrilo, is that standard text chat is often too slow and cumbersome for activities that require tight coordination between players, such as raids or PvP. While text chat is good for certain kinds of activities in games (e.g., chatting across zones, advertising wares), it is not so good for real-time joint activities (e.g., traveling together, fighting together). There are of course a few different reasons for this, most of which Pavel Curtis pointed out years ago. One is that typing itself is somewhat slow (especially compared to speaking). Another is that text chat increases the burden on the users' hands, which they may also want to use for other things at the same time, such as firing off a spell or navigating their avatar. Yet another reason is that most games (with the exception of There) implement standard text chat in which the composition of turns is kept private. In other words, turns are posted on a message-by-message basis (rather than a word-by-word or character-by-character basis). This creates interactional lag...
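To make the difference between the two posting styles concrete, here is a minimal sketch, not a model of any particular game's chat system. It assumes a constant typing speed and ignores network delay; the `Turn` class, its field names, and the numbers in the example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    text: str           # what the sender types
    start_s: float      # when the sender starts typing, in seconds
    chars_per_s: float  # assumed constant typing speed (hypothetical)


def first_visible_message_mode(turn: Turn) -> float:
    """Message-by-message chat: nothing is visible until the whole turn is typed and sent."""
    return turn.start_s + len(turn.text) / turn.chars_per_s


def first_visible_char_mode(turn: Turn) -> float:
    """Character-by-character chat: the first character is visible as soon as it is typed."""
    return turn.start_s + 1 / turn.chars_per_s


def interactional_lag(turn: Turn) -> float:
    """Extra time the recipient waits before seeing anything in message-by-message chat."""
    return first_visible_message_mode(turn) - first_visible_char_mode(turn)


if __name__ == "__main__":
    # Hypothetical turn: the sender starts typing at t=0 at 5 characters per second.
    turn = Turn("Wait, I think there are two of them.", start_s=0.0, chars_per_s=5.0)
    print(f"message-by-message: visible after {first_visible_message_mode(turn):.1f}s")
    print(f"character-by-character: visible after {first_visible_char_mode(turn):.1f}s")
    print(f"interactional lag: {interactional_lag(turn):.1f}s")
```

Under these assumptions the lag grows with the length of the turn, so longer turns stay hidden from the other player for longer.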

Now although players can be quite good at managing this interactional lag, it can at times cause slippages that impact their joint activities. If we look at avatar interactions under a "microscope," we can see how this slippage occurs. The following transcript (or annotated chat log) comes from a screen-capture video collected from EverQuest II. It begins as Ataca and Rattington are roaming around outside the gates of West Freeport looking for some animals to hunt together for xp. (Note: double parentheses mark transcriber's notes and italics mark system-generated messages.)


[Hunting Armadillos: EverQuest II; Ataca's perspective]
01 07:38:16 ((Ataca stops running))
02 07:38:54 ((Ataca targets an armadillo))
03 07:41:38 ((Rattington stops next to Ataca))
04 07:41:38 Ataca points at a banded armadillo.
05 ((Ataca's avatar points))
06 07:43:06 ((Ataca rearranges icons in her toolbar))
07 07:50:00 ((Ataca mouses over an attack spell))
08 07:51:00 ((Ataca mouses over armadillo))
09 07:54:16 ((Ataca initiates an attack by clicking a
10 spell icon in her toolbar & her avatar begins
11 a spell-casting animation))
12 07:57:42 You surround A young armadillo with arcane chains!
13 07:58:18* Rattington says to the group, "I see two that
14 are grouped but I think we could take them."
15 08:00:10 ((Ataca clicks another spell icon))
16 08:01:58 Rattington hits banded armadillo for 4 points of
17 crushing damage.
18 08:03:34 You strike A banded armadillo with a storm of lightning!
19 08:04:08 Rattington hits a banded armadillo for 3 points of
20 crushing damage.
21 08:05:02 Ataca says, "hehe"

In this episode, Ataca approaches some banded armadillos, selects them and "points" to them (lines 01-05) by clicking on a toolbar icon for "pointing." In this context, such an action can be seen as a proposal to attack the armadillos, although the more standard practice is to "hail" them by selecting them and pressing the H-key. Rattington can observe Ataca's pointing by virtue of an avatar animation (line 05) and a text emote (line 04). Ataca then privately rearranges icons on her toolbar (line 06) for about 8 seconds, while Rattington appears to do nothing. Ataca then proceeds to initiate an attack on the armadillos with a click of a spell icon on her toolbar (lines 09-11). Rattington can observe this by virtue of a spell-casting animation of Ataca's avatar, an alert sound, and eventually a text message after the spell is cast (line 12). But then an interactional slippage becomes apparent. Four seconds after Ataca has initiated the attack, Rattington says, "I see two that are grouped but I think we could take them" (lines 13-14). In other words, he offers his assessment of whether the armadillos are suitable targets; however, it appears too late to be consequential for the attack because of the interactional lag caused by the chat system. He almost certainly started typing his turn before Ataca initiated the attack, but she could not see his turn unfold in real time. Rattington then joins in the attack (line 16), and Ataca chuckles at their apparent lapse in coordination (line 21).
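The timing argument can be checked with a little arithmetic on the transcript's own timestamps. The sketch below is a rough illustration only: it assumes the stamps read as minutes:seconds plus a sub-second field (which it ignores), and that Rattington typed at a steady rate with no pauses. The function and variable names are made up for the example.

```python
def to_seconds(stamp: str) -> int:
    """Convert an 'MM:SS:xx' stamp to whole seconds, ignoring the last field (assumed sub-second)."""
    minutes, seconds, _sub = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


ATTACK_INITIATED = "07:54:16"  # transcript lines 09-11
TURN_POSTED = "07:58:18"       # transcript lines 13-14
MESSAGE = "I see two that are grouped but I think we could take them."

# Seconds between Ataca initiating the attack and Rattington's turn appearing.
gap_s = to_seconds(TURN_POSTED) - to_seconds(ATTACK_INITIATED)

# Typing speed Rattington would have needed to start typing only AFTER the attack began.
threshold_cps = len(MESSAGE) / gap_s

if __name__ == "__main__":
    print(f"gap between attack and posted turn: {gap_s}s")
    print(f"message length: {len(MESSAGE)} characters")
    print(f"required speed to have started after the attack: {threshold_cps:.1f} chars/s")
```

Under these assumptions he would have needed to type the whole turn in the four-second gap, which works out to more than 14 characters per second. Few people type that fast, which is why it seems safe to say he began composing before the attack started.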

These two low-level characters do in fact eventually defeat the two perturbed armadillos without much difficulty. However, had this been a 60-person raid against an epic dragon or a PvP encounter against an opposing team of players, this kind of lapse in coordination could have had greater consequences. In such situations, having a real-time medium for talking with your teammates, like voice (VoIP or co-presence) or even word-by-word chat, is a distinct advantage. Tighter coordination between players can be achieved when avatars, which move in real time (or nearly real time), have voices to match.

Of course, VoIP does not magically solve everything. It creates interactional issues of its own, but that's a topic for a future post. Some of these issues will no doubt be discussed at the panel on "Community and Communications in Massive Multi-Player Online Games" at VON 2006 (Tuesday, March 14, 2006, 1:00pm - 2:15pm), in which Nick Yee and I will be participating.

Posted by Bob

Posted at February 8, 2006 11:41 AM


Comments

Although VoIP is nice for this kind of grinding in groups, I have found that usage in larger groups can also be problematic.

Of course there are huge benefits, but consider the "noise" when more than one person speaks at a time. If several people are talking in a group, it is easy to hear what all people are saying, as well as know WHO is saying WHAT. If multiple people speak at the same time using VoIP, sometimes you can't hear or even decipher what is being said by all players, let alone know who was actually saying it.

Additionally, there is some lag as well to VoIP. If players do not have their own local sound turned down, you can hear the other player's environment sounds after they occur in your own UI.

Posted by: Matt at February 13, 2006 01:31 PM

...there is, of course, an opposing view on the voice vs text issue. I may be its sole representative ....

I type very fast and I think very fast with my fingers. I also express myself much better in a written format. If I were in a real-life conversation with you and trying to say the same things I'm typing here, they'd probably be less verbose, but they'd also be less understandable.

I also don't like voice programs because I dislike the sound of my own voice ... and there is a social consideration that I am always playing female characters, for reasons of my own ... while I've never made a big secret of it or used it to deceive, I really don't want to go onto TS and have to answer questions or defend why a female character has a male voice; it's just not a fuss I want to get into. (Some players, usually young males, are really bothered by the idea of other male players having female characters. But that too is a digression.)

Anyway, point is, *I* can keep up with text, but I realize that a lot of other people can't. What this means in practice is that the sort of in-game communication that needs to happen for players to group effectively, which was never good in text format to begin with, gets steadily worse for the text-only player. The rule has become "Get voice or get left behind," and I resent it.

In terms of the need to talk while fighting or doing other things, I think that's greatly exaggerated. You don't socialize while fighting anyway. (Actually, in MMORPGs there is barely any social conversation at all, and that's another gripe and a third digression.) I operate on the "initial briefing" plan - actions carefully coordinated before the fight, and mostly silence during the fight. It works pretty well - assuming you find a group that is willing to sit still for the planning beforehand ... and that's the real problem.

I may be overly cynical, especially since I'm anywhere from ten to twenty years older than the average player (depending on the game), but I think the real problem is that to function well in a text medium takes more patience and focus than the average player has - they just want to dash off and start swinging, over and over, and the voice method is more suited to the rash impulse.

---

By the by, the non-cached message units in There had their own problems. I don't like having other people see me forming the thoughts as I type them ... but it bothered me far less than some of my friends there, who felt that it made them look stupid ... that everyone could see how slowly they formed thoughts or how many times they had to go back in mid-sentence and correct themselves.

Eventually, though, a sense of humor developed about it, and one standard joke among my There peers was to begin typing something and then deliberately strike it out and type the "corrected" version, the same way old Unix geeks might write "Our duly elected moron^H^H^H^H^Hmayor" or use strikeout code.

Posted by: Todd at February 14, 2006 12:12 PM

Todd - I too almost always use text chat for the same reasons. But I've found that when tight coordination matters most, the interactional lag created by the standard chat systems can cause coordination troubles. If I played a male character and did raids or PvP most of the time, I think I'd probably switch to voice. Of course, that sort of game play is starting to look a lot like an online first-person shooter where voice is the standard.

Posted by: Bob at February 14, 2006 12:16 PM

"(Actually, in MMORPGs there is barely any social conversation at all, and that's another gripe and a third digression.)"

You should join my guild then. /gchat is entirely social chat, almost to the point of irritation. There is rarely any kind of quest/raid/pvp discussion there. Plenty of Chuck Norris and other spam, as well as actual (good) social exchange.

Posted by: Matt at February 14, 2006 01:24 PM

Matt - Nice points. The 3rd party VoIP applications certainly have serious limitations, especially since they're not integrated with the game software itself. (I hear that There.com has one of the best voice systems at the moment.)

How to simulate the "cocktail party effect" - one's perceptual ability to focus on a single conversation within a larger cacophony of many voices - in online audio spaces is still a major technological hurdle (although there is actually a PARC technology that tackles just this problem).

Of course, there is no reason that game developers must choose between voice OR text chat. I'd like to see more sophisticated, nimble text chat AND voice systems offered in today's virtual worlds.

Posted by: Bob at February 14, 2006 02:00 PM

By the by, I mentioned some of the comments I made in that post to an old friend and heavy WOW player, and she disagreed with my thought that most of the necessary planning can be done before missions. She runs the sort of hardcore stuff I was never really able to do because of time issues. (I don't know how familiar you are with WOW-speak, but she's in a guild that spends most of their time these days running Molten Core and the other high-end instances that require LARGE groups and a big time investment.)

I agree that in situations like that, you have to pause and regroup/replan every so often along the way, but I still believe that you can balance the typing and the fighting. She disagrees strongly; she doesn't think she could do MC effectively without having a voice program. Since she has far more experience with big groups than I do, I'm not prepared to contradict her. In short, I think my position is valid for the sort of small groups I played, but it may not scale up well.

Posted by: Todd at February 14, 2006 02:02 PM

Hi Bob and everyone,

I ended up here looking at Bob's homepage (I am also a speaker on that panel next month).

Matt makes a good point: with monaural audio, there is no intelligibility of any of the voices during overtalk. In a real cocktail party, you hear with two ears. The brain has the ability to filter by vector of origin, and so you can tune out all the stuff you don't want to hear and focus on the voice of interest.

With stereo, not only is this problem solved but also the very logical one in a 3D game: when someone says something tactical (e.g. "cover me!") you know where to point the gun, vs. having to spin 360 to find where the person is.

At the risk of appearing overtly commercial in an inappropriate forum (apologies in advance if so), I would like to say that DiamondWare has built a 3D VoIP system with APIs designed to be integrated into a game or VR simulation. If you're interested in a demo, please email me at keith@dw.com.

Bob, I would especially like to trade demos with you, to see this PARC technology you reference as well as show you our stuff and see what kind of collaboration opportunities there are. The technology also has huge enterprise collaboration and workspace community applications (I noticed this was another PARC research area).

Posted by: Keith Weiner at February 18, 2006 01:09 PM

Sort of a stale topic, but what the hey.

I find that voicechat ruins my immersion, and barely overcomes one of the fundamental reasons ingame chat systems suck: they're designed by gamers, not warriors or wargamers.

Immersion-wise, voicechat never represents what I imagine avatars to sound like, given what I know of their personality and appearances. Voicechat fidelity is pretty poor, and so isn't good for social situations (scummy roleplayer, I am), and the timing tends not to integrate with textual emotes players may be generating. Worse, it creates a social stratification between the "haves" and the "have nots". This has been seen in There.

Militarily, most ingame chats are poorly integrated with the UI, don't provide secure chatlines, are terribly laggy, and don't provide easy ways to create command and control hierarchies. I am a neurotic type and teach my guildmates along the lines of German 'auftragstaktik', wherein individual initiative is encouraged at all levels, and C2 (command and control) is only generally exercised from the top. Thusly, when things are going well and with the outline of a plan, there really is little communication required. Good discipline and training minimize how much needs to be said, so when things are difficult, the comms needs are still minimal. This is how it goes in the real world, even though most modern combatants have personal short-range radios.

If I had securechats, fast chat systems, UI integration such as target designation, group waypoints, and a common operational picture (I see on an overhead map what my guildies see), then I wouldn't need voicechat at all.

Designers really need to read Rick Shelley's "Dirigent Mercenary Corps" books, particularly those guys who do SF games like SWG. Even in the FPSs, the C2 systems suck and are inferior to what you can buy off the shelf today.

Posted by: DoogieHozer at April 17, 2006 02:22 PM
