August 21, 2007
Guild Name Generator
In our analysis of guild names, we found that a basic grammar was able to parse about 90% of the names. This allowed us to create a guild name generator that used the grammar to create new guild names based on the weighted vocabulary of World of Warcraft guild names.
Description of Guild Name Grammar
< > denotes parts of speech
[ ] denotes optional elements
1. Singletons
- [The] <Noun> / <Adjective>
- e.g., Chaos, Brutality, The Wicked, The Legion
2. Simple Noun Phrase
- [The] <Adjective> <Noun>
- e.g., Eternal Angels, The Swinging Swords
3. Complex Noun Phrase
- [The] <Two Word Adjective> <Noun>
- e.g., Pretty Pink Gnomes, The Blood Knuckle Pirates
4. "Of" / "In" Construction
- [The] <Group Term> of [the] [<Adjective>] <Noun>
- e.g., Brotherhood of Light, The Army of Dark Night, Gathering of Bloody Blade Fools, Brothers in Arms
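As a rough illustration of how a generator could walk these four rules, here is a minimal Python sketch. The vocabulary, the weights, the uniform choice between rules, and the 50% chance for optional elements are all invented for this example; the function names (`pick`, `optional`, `generate`) are hypothetical and do not come from the actual generator.

```python
import random

# Hypothetical weighted vocabulary (word -> relative frequency). The counts are
# invented for illustration; a real lexicon would come from parsed guild names.
LEXICON = {
    "noun": {"Legion": 5, "Angels": 3, "Swords": 2, "Light": 4, "Pirates": 1},
    "adjective": {"Eternal": 4, "Wicked": 2, "Dark": 5, "Bloody": 1},
    "group": {"Brotherhood": 4, "Army": 3, "Gathering": 1},
}

def pick(pos, rng):
    """Draw one word for a part of speech, proportional to its weight."""
    words = list(LEXICON[pos])
    weights = [LEXICON[pos][w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

def optional(word, rng):
    """Model a [ ] element: include the word half of the time (assumed rate)."""
    return [word] if rng.random() < 0.5 else []

def generate(rng):
    """Build one name from a randomly chosen rule of the four-rule grammar."""
    rule = rng.choice(["singleton", "simple", "complex", "of"])
    parts = optional("The", rng)
    if rule == "singleton":
        parts.append(pick(rng.choice(["noun", "adjective"]), rng))
    elif rule == "simple":
        parts += [pick("adjective", rng), pick("noun", rng)]
    elif rule == "complex":
        # Simplification: a two-word adjective is modeled as two adjectives.
        parts += [pick("adjective", rng), pick("adjective", rng), pick("noun", rng)]
    else:
        parts += [pick("group", rng), "of"] + optional("the", rng)
        if rng.random() < 0.5:
            parts.append(pick("adjective", rng))
        parts.append(pick("noun", rng))
    return " ".join(parts)

rng = random.Random(0)  # seeded so runs are repeatable
names = [generate(rng) for _ in range(5)]
```

Weighting the words but not the rules keeps the sketch short; a closer imitation would presumably also weight each rule by how often it occurs in real guild names.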
The grammar described can correctly parse about 90% of unique guild names over a 3-month period in WoW. The remaining guild names that fall outside of this grammar tend to be:
1. Prepositional Phrases
- The Darkness Within
- Mad When Wet
2. Single Letter Names
- O T C
- D T A
3. Uses Pronouns / Verb Phrases
- I OWN YOU
- We Eat Allies
4. Contains Foreign Words
- La Fleur de Lys
Cases 1 and 2 are excluded by the parser. Currently, case 3 names are excluded if they use a pronoun and case 4 names are erroneously parsed within the weighted lexicon.
Posted by nickyee at 01:10 AM | Comments (3)
March 02, 2007
Accumulated Leveling Times
The timing of the expansion gave us a very interesting opportunity to estimate leveling times. In the past, we could estimate each individual leveling event, but it was impossible to know the accumulated leveling time of a character if the character was created before we started capturing snapshots. And even if we only included characters created after the snapshots began, we would have to aggregate across different months to get a sizeable pool of characters, which introduced potential time-related confounds (e.g., a certain class was balanced).
The expansion encouraged many players to start a new character at the same time - specifically Draenei and Blood Elves. We know that all Draenei and Blood Elves were created after January 17th, and there are many of these characters. This allowed us to use a large sample of actual accumulated leveling times to estimate the overall curve.
We started by calculating the average accumulated playing times of Draenei and Blood Elves for each level. The blips in the graph (especially post-50) are due to low samples and potential breaks in the data collection process. However, the graph did hint at an underlying curve.
A curve estimation algorithm showed that the power curve best fit the raw data. The resulting r-squared was .98. In other words, the estimated curve captured about 98% of the variance in the raw data.
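The post does not say which tool produced the fit. One common way to fit a power curve is ordinary least squares on the logarithms of both variables, sketched below in Python. The function name `fit_power_curve` and the data are hypothetical: the sample points are synthetic and generated from an arbitrary curve, not taken from the census data, and the r-squared here is computed in log space, which may differ from how the reported .98 was computed.

```python
import math

def fit_power_curve(levels, hours):
    """Least-squares fit of hours = a * level**b as a straight line in log-log space.

    Returns (a, b, r_squared), with r_squared measured on the log-transformed data.
    """
    xs = [math.log(x) for x in levels]
    ys = [math.log(y) for y in hours]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log space = exponent
    log_a = mean_y - b * mean_x         # intercept in log space = log of scale
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), b, 1 - ss_res / ss_tot

# Synthetic, noise-free example data (NOT the census data): hours = 0.05 * level**2.1
levels = list(range(2, 71))
hours = [0.05 * lv ** 2.1 for lv in levels]
a, b, r2 = fit_power_curve(levels, hours)
```

Because the synthetic points lie exactly on a power curve, the fit recovers the scale and exponent used to generate them; real averages per level would scatter around the line and give an r-squared below 1.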
Below, we plot out the smoothed curve that was generated. The data suggests that it will take a player on average 15 full days of accumulated playing time to reach level 70, and that the 10-day mark is crossed at approximately level 56.
Our much earlier analysis estimated that it took about 15 full days to reach level 60. This suggests that characters have leveled quicker over time, possibly due to extensive twinking, familiarity with quests and instances, or a well-stocked economy.
One potential bias in this data is that all the high level Draenei and Blood Elves (particularly those who are level 60 and above) are probably more hard-core than the average WoW player. Thus, the high end of the data might not reflect the average player. One counterargument is that the curve doesn't seem to break. In other words, the accumulated time of level 70 characters does fall in the correct range as would be predicted even if we looked at levels 1-50 alone. This suggests that those high level Draenei and Blood Elves didn't level "quicker" so much as spend many more hours playing in the month that the expansion was out. In either case, we would be able to sample the data again once more Draenei and Blood Elves are past level 60 and see whether the curve changes.
Server Sample: RP (High), PvE (High), PvE (High), PvP (High), PvP (High)
Sampling Period: January 17th 2007 - February 17th 2007
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique Draenei and Blood Elf
Sample Size: 42,922 Blood Elves and 35,939 Draenei
Posted by nickyee at 04:40 PM | Comments (2) | TrackBack
February 21, 2007
Characters in BGs after Burning Crusade
The Burning Crusade changed the WoW landscape a lot. One area that was significantly impacted was the number of unique characters in the old BGs (Alterac, Arathi, Warsong, & Eye of The Storm). The end-game BGs in which many level 60 characters spent time were suddenly almost deserted, and queue times were back to almost pre-cross-realm levels. Now, it's intuitive that post-60 content is more appealing to many players than the same old BGs. What may be less obvious is the added pressure for those level 60 players to get the expansion pack. Level 60 players who were content to just PvP now have to endure much longer wait times to get into BGs. In other words, level 60 life without the expansion pack became difficult. Anecdotally, it also soon became clear to the BG stragglers that the people stuck in those BGs were the ones who didn't have the expansion - yet. It's also interesting to point out that this caused a large shift from a competitive orientation (in BGs) to a leveling/achievement orientation (in BC content) for many WoW players.
Data Note: We had a data collection mix-up that accidentally ignored all post-60 characters until the 29th of January. When we did get that data in correctly, the same trend was seen on the 30th and 31st. So the large drop isn't simply due to us not seeing post-60 characters at first.
Server Sample: RP (High), PvE (High), PvE (High), PvP (High), PvP (High)
Sampling Period: Month of January 2007 up till the 16th.
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique level 60 character each day.
Sample Size: Peak of 10,500 characters across 5 servers.
Posted by nickyee at 03:39 PM | Comments (0) | TrackBack
February 09, 2007
New Races Level Progression
We tabulated the levels of all the characters of the two races for each day in January after the expansion roll-out. The following flash widget shows the average level progression by day for the Draenei and the Blood Elves across the 5 servers we monitor.
Posted by nickyee at 04:34 PM | Comments (2) | TrackBack
November 29, 2006
PvP Ranks Change (Basic)
After looking at PvP ranks in one week of time, we decided to explore the changes in PvP rank over time. For this, we took two consecutive one-week periods to calculate the PvP rank change. We start here by providing a sense for how much of the player base we were able to capture.
Of the 128,354 characters, we had PvP rank information for both weeks for 41,997 characters. This turns out to be about 57% of all characters above level 45 (i.e., the average level of Rank 1 characters). While this is only about half of all possible characters, it is a large enough sample to explore some of the underlying differences.
We found that most characters (80%) do not change rank over a one week period. About 5.5% lost rank and 13.5% gained rank. As the graph below shows, most of the changes occur in the +/- 1 range. Characters who gained more than 2 ranks were all unranked the week before.
Below is a graph that shows the average rank change for characters in each of the 14 ranks. The plot shows that from Rank 1 to Rank 7 most players tend to gain rank from week to week, but that it is difficult to hold on to your rank once you get to Rank 11 and above. In those ranks, there is an average downward trend. In particular, most characters who were at Rank 14 (the highest rank) give it up as soon as they reach it. This is consistent with anecdotal data from the WoW forums. After all, once the Grand Marshal equipment has been acquired, there is little incentive to maintain Rank 14.
Server Sample: RP (High), PvE (High), PvE (High), PvP (High), PvP (High)
Sampling Period: Two consecutive one-week periods in October, both starting on Tuesday at 10am pacific time (i.e., after ranks have been calculated for that week).
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character in each hour of the day.
Data Filter: None
Sample Size: 128,354 characters
Posted by nickyee at 04:52 PM | Comments (2) | TrackBack
July 28, 2006
Alliance / Horde Ratio Over Time
We were also interested in looking at the Alliance vs. Horde ratio over time. For example, are there incentives to reach parity or do imbalances create increased skews? What we found was surprising in that no significant shifts over time were seen. The ratio of Alliance and Horde appeared incredibly stable over time.
Of note, while there was a severe Alliance imbalance on the PvE and RP servers, there was a matched equilibrium on the PvP server. Again, neither changed over time.
Server Sample: RP (High), PvE (High), PvP (High)
Sampling Period: July 2005 - January 2006
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character in each 2 week period.
Data Filter: Characters above level 1 and who spent less than 95% of their time in a main city.
Sample Size: ~100k characters in each 2 week period
Posted by nickyee at 01:31 PM | Comments (0) | TrackBack
July 21, 2006
The Dreaded Level 60 Scraper Cap, Revisited
For the last week, I have been trying to change the PlayOn scraperbot software to avoid the problem of not seeing all the level 60 characters that may be logged in at any one time.
And the short answer is: Every attempt I've made to improve the collection has only made things worse.
After bashing my head against my keyboard [1], we have decided to recruit the collective wisdom of the web. And to offer a nearly-worthless prize.
Herewith, we offer some Parc-labeled thingy as-yet-unknown (i.e. a coffee mug, or T-shirt, or some such) to anyone who can help us characterize the nature of results returned by the "/who" command in World of Warcraft.
Here's what we know so far...
The "/who" command, and the "Refresh" button in the socials who pane, both use the SendWho API call. Use whichever means you wish for testing -- they're all the same.
All of these take a filter, which can be used to select for certain names, zones, classes, races, or levels.
So
/who z-"q"
will try to find players with a "q" (in either upper or lower case) as a substring of their zone name, while
/who c-"d"
will return both druids and paladins.
The existing scrapers will try to perform a query of all race+class combinations for one of the factions, such as
/who r-"Troll" c-"Rogue" 1-60
The server will only return 50 results at most. Along with others, we have assumed that if the server returns 50 results, there may be more than 50, and we need to refine our query. On the other hand, we have assumed that if the server returns fewer than 50 results, it has told us about all logged-in characters that satisfy the filter. Apparently, not so.
So, if the server returned 50 results for the above, we would split the query in the obvious way to
/who r-"Troll" c-"Rogue" 1-30
/who r-"Troll" c-"Rogue" 31-60
and then split each of those in turn until we are searching for a single level of a certain race and class.
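The splitting strategy described above can be sketched in Python as a recursive bisection of the level range. This is only an illustration of the logic, not the scraperbot code: `query` is a hypothetical stand-in for issuing a "/who" with race, class, and level filters, and `make_fake_server` simulates a server that truncates at 50 results. Note that the sketch bakes in the very assumption this post goes on to question, namely that fewer than 50 results means the list is complete.

```python
CAP = 50  # the server returns at most 50 results per query

def scrape(query, race, cls, lo=1, hi=60):
    """Collect characters of one race+class by bisecting the level range on overflow.

    `query(race, cls, lo, hi)` is a hypothetical function standing in for
    /who r-"race" c-"cls" lo-hi and returning the list of names shown.
    """
    results = query(race, cls, lo, hi)
    if len(results) < CAP or lo == hi:
        # Under the cap: assumed complete. Single level at the cap: record and stop.
        return list(results)
    mid = (lo + hi) // 2  # 1-60 splits into 1-30 and 31-60
    return scrape(query, race, cls, lo, mid) + scrape(query, race, cls, mid + 1, hi)

def make_fake_server(population):
    """Simulated server for testing. population: list of (name, race, cls, level)."""
    def query(race, cls, lo, hi):
        hits = [name for (name, r, c, level) in population
                if r == race and c == cls and lo <= level <= hi]
        return hits[:CAP]  # truncate the way the real server appears to
    return query

# Two Troll Rogues at every level from 1 to 60 (made-up population).
population = [(f"Troll{i}", "Troll", "Rogue", 1 + i % 60) for i in range(120)]
found = scrape(make_fake_server(population), "Troll", "Rogue")
```

When a single level is itself saturated, the recursion bottoms out and keeps only the 50 names it was given, which is the level 60 cap problem in miniature.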
From early on, however, if the server was busy, and we queried
/who r-"Dwarf" c-"Paladin" 60-60
we were likely to get 50 results. So, we simply recorded those 50, and subdivided no further.
As the servers we're monitoring have aged, however, we find that more and more race-class combinations are saturating the query results at peak times. This is frustrating, as it calls into question the validity of our data, especially when we are trying to analyze the level 60s, who have reached or are transitioning into the endgame, an important part of the WoW landscape.
So, as I mentioned, about a week ago I began to restructure the scraping code, so that rather than splitting by race and class first, it would instead query first by zone, and then by level, and then by race and class. Let's ignore the perils of Blizzard being able to create new zones to be found, or the fact that the filter will not let me query for people only in the zone "Ahn'Qiraj" without also telling me who is in "Gates of Ahn'Qiraj" and "Ruins of Ahn'Qiraj" as well.
In fact, I decided that in order to get the scrape times back down to where they used to be, I would play some shenanigans to do queries like
/who z-"org" 1-60
to check for toons in both "Orgrimmar" and "Searing Gorge" simultaneously, and be able to break it into zone-specific queries only if there was overflow.
When I did this, a surprising thing happened: We started seeing about 50% fewer inhabitants in the world than the old race-class scrapers. Hmm. Bug in the code? So I thought, until I ran some experiments in-game to flush it out.
Within a few seconds, I did the following queries mid-afternoon on a medium pop server, horde-side:
/who z-"or" 60-60
/who z-"org" 60-60
/who z-"or" 60-60
Guess what: The first and third returned 24 results, none of whom were in Orgrimmar, while the second returned 50 results from Orgrimmar and Searing Gorge.
After further checking, we discovered that we have been seeing 5-10% fewer characters even with full zone names than we see from the old race-class scrapers, whether we split by level->race->class after the zone, or by race->class->level.
We've tested and discarded a number of explanations. One that remains is that WoW does the z-, c-, r-, and levels as stages internally, and has cutoff capacities for the internal results.
In any case, we're wondering if one of you out in the webosphere can explain just what's going on. If you've got an idea, jump into the game, and poke around until you're pretty sure you've grokked what's happening.
Let us know, and we'll send you something virtually worthless.
Posted by at 04:34 PM | Comments (9) | TrackBack
July 14, 2006
Guild Membership and Stability Over Time
We then looked at how guild size and stability change over time. First, we looked at the percentage of characters who were in guilds. There was a mild positive increase over time.
This increase in percentage of guilded characters could mean one of two things. There may be more guilds that spring up, or characters are joining existing guilds. The following chart of average guild size suggests the latter is the case. Over time, established guilds attract more and more characters and increase in size.
Over time, guilds also stabilize. As the following chart shows, members are less likely to quit a guild as a server matures. Overall, these three charts suggest that over time, characters on a server are more and more likely to be in a guild. The guilds they join tend to be established guilds. And over time, guild turn-over decreases.
Server Sample: RP (High), PvE (High), PvP (High)
Sampling Period: July 2005 - January 2006
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character in each 2 week period.
Data Filter: Characters above level 1 and who spent less than 95% of their time in a main city.
Sample Size: ~100k characters in each 2 week period
Posted by nickyee at 05:47 PM | Comments (2) | TrackBack
July 07, 2006
Levels and Grouping Over Time
Our long-term sampling allowed us to look at trends over a long period of time. In the current analysis, we looked at data over a 6-month span.
In that 6-month period, the percentage of level 60 characters has more than doubled (from around 10% to around 24%). That comes out to roughly 2 additional percentage points of level 60 characters every month.
But it was interesting that even as the level composition of the game space changed in that time, the average grouping ratio was fairly stable. This implies that level composition has a minimal effect on grouping behavior. The sudden slide and drop at the end of the graph is due to an in-game API change. Blizzard suddenly stopped making the "grouped" variable public in early 2006.
Server Sample: RP (High), PvE (High), PvP (High)
Sampling Period: July 2005 - January 2006
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character in each 2 week period.
Data Filter: Characters above level 1 and who spent less than 95% of their time in a main city.
Sample Size: ~100k characters in each 2 week period
Posted by nickyee at 02:41 PM | Comments (0) | TrackBack
June 30, 2006
Naming Patterns in Fantasy Races
We were interested in the names that people picked for their characters. Were there commonalities across the WoW races or did different races have their own naming conventions? So we parsed through all unique names in the January data in several ways - by first letter, by 3-letter prefix, and 3-letter suffix. Here are the top 10 lists for these different parses by race.
There were interesting findings throughout. The most common first letter of a name was "S". 10% of all names in the sample began with an "S". This was followed by "A" (7.6%) and "D" (6.7%). If anyone has access to the distribution of first letters of English names, please let us know how this matches to those.
Among the prefixes, it was interesting to see the prefix "Sha" appear in the top ten for all the races. It was also interesting to see racial differences. For example, we see references to "Moon", "Night" and "Star" among the Night Elf names, whereas we find references to "Dead/Death", "Mort" and "Malevolence" in the Undead names.
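For readers who want to try the same kind of tally on their own name lists, here is a small Python sketch of the three parses (first letter, 3-letter prefix, 3-letter suffix). The function name `name_patterns` and the sample names are made up for illustration, and the case normalization is an assumption about how ties between "Sha" and "sha" should be treated.

```python
from collections import Counter

def name_patterns(names, top=10):
    """Return top-N lists of first letters, 3-letter prefixes, and 3-letter suffixes."""
    firsts = Counter(n[0].upper() for n in names if n)
    # Names shorter than 3 letters have no 3-letter prefix or suffix and are skipped.
    prefixes = Counter(n[:3].capitalize() for n in names if len(n) >= 3)
    suffixes = Counter(n[-3:].lower() for n in names if len(n) >= 3)
    return firsts.most_common(top), prefixes.most_common(top), suffixes.most_common(top)

# Made-up names, not drawn from the census sample.
sample = ["Shadow", "Shana", "Sharp", "Moonfire", "Deathwish"]
first_letters, top_prefixes, top_suffixes = name_patterns(sample)
```

Running the function once per race, on that race's names only, would give per-race top 10 lists like the ones discussed above.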
Server Sample: RP (High), PvE (High), PvE (High), PvP (High), PvP (High)
Sampling Period: Month of January
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character.
Data Filter: None
Sample Size: 179,003 characters
Posted by nickyee at 02:33 PM | Comments (5) | TrackBack
April 03, 2006
Guild Churn by Server Type
We then looked at whether guild member churn was different across the server types. The data showed that member churn was significantly and consistently higher on PvP servers than RP or PvE servers. The member churn rate on PvP servers is about 75% to 100% higher than that on RP or PvE servers.
Off the top of our heads, we had no explanation for this dramatic difference. It was clear that characters were more likely to leave and switch guilds on a PvP server, but it's not clear whether this is because of the PvP setting or because players who join PvP servers are a priori different from those who join RP or PvE servers.
Server Sample: RP (High), PvE (High), PvE (High), PvP (High), PvP (High)
Sampling Period: Month of January
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique guild with a guild size greater than 1.
Data Filter: None
Sample Size: 5,285 guilds
Posted by nickyee at 12:31 PM | Comments (5) | TrackBack
March 13, 2006
Raid Content Use
In the month of January, we tracked 223,043 characters. Of these, 11,098 (5%) spent time in high-level raid content (BWL, MC, or ZG). The majority of these were level 60 (as expected) - 99.4%. The remainder were level 56-59 (0.6%).
Of all the level 60s, 30% have spent time in raid content. On average, characters who spent time in raid content spent 310 minutes (about 5 hours) over the month of January in raid content.
Of those who spent any time in raid content, 28% spent less than an hour in raid content. In other words, 72% of these characters spent more than an hour in raid content. Thus, 3.6% of all observed characters spent more than an hour in raid content over the month of January.
Server Sample: RP (High), PvE (High), PvE (High), PvP (High), PvP (High)
Sampling Period: Month of January
Sampling Resolution: ~12 minutes
Parsing Method: All unique characters were tracked.
Data Filter: None
Sample Size: 223,043 characters
Posted by nickyee at 11:45 AM | Comments (26) | TrackBack
March 03, 2006
Guild Churn
It's easy to talk about guilds as somewhat stable entities over a one month period, and by and large, most guilds with more than 10 members do survive from one month to the next. But we were interested in exploring the amount of guild member churn that occurs. For example, given the guilds with 30 members, how many characters were in that guild at some point during the month but are no longer in that guild?
To do this analysis, we tabulated two guild rosters:
Full Guild Roster: For each guild, note down all characters who have been observed to bear this guild tag at any point during the logging period.
Current Guild Roster: For each guild, note down only those characters who actually still bear this guild tag.
A character who is in the full guild roster but not the current guild roster is not simply a character who was not observed towards the end of the month. For this difference to occur, they must have deguilded (not bearing any guild tag) or joined another guild (bearing a new guild tag).
Thus for each guild, the difference between those two roster sizes is the member churn - the number of characters who were at one point in the guild but aren't there any longer. Below is the average churn for guilds of different sizes. The churn percentage was around 25% and was fairly stable across guilds of all sizes. In other words, if we see a guild that currently has 20 members, then over the past month, about 5 members have left the guild.
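The two-roster bookkeeping can be sketched in a few lines of Python. This is an illustrative reconstruction, not the parser we ran: the `(time, character, guild_tag)` record layout and the function name `guild_churn` are assumptions, and "current" is taken to mean the tag a character bore the last time it was observed.

```python
from collections import defaultdict

def guild_churn(observations):
    """Compute (full roster size, current roster size, churn) per guild.

    observations: iterable of (time, character, guild_tag), where guild_tag is
    None for an unguilded sighting. Records may arrive in any order.
    """
    full = defaultdict(set)   # guild -> everyone ever seen with the tag
    last_seen = {}            # character -> (time, guild_tag) of latest sighting
    for time, char, guild in observations:
        if guild is not None:
            full[guild].add(char)
        if char not in last_seen or time >= last_seen[char][0]:
            last_seen[char] = (time, guild)
    current = defaultdict(set)  # guild -> characters whose latest tag is this guild
    for char, (_, guild) in last_seen.items():
        if guild is not None:
            current[guild].add(char)
    return {g: (len(full[g]), len(current[g]), len(full[g] - current[g]))
            for g in full}

# Made-up sightings: B switches from X to Y, C drops its guild tag.
sightings = [(1, "A", "X"), (1, "B", "X"), (2, "B", "Y"), (3, "C", "X"), (4, "C", None)]
churn = guild_churn(sightings)
```

A churn percentage then follows by dividing the churn count by a roster size; the 20-member example above suggests the current roster is the denominator, but that reading is an inference.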

Server Sample: RP (High), PvE (High), PvE (High), PvP (High), PvP (High)
Sampling Period: Month of January
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique guild with a guild size greater than 1.
Data Filter: None
Sample Size: 5,285 guilds
Posted by nickyee at 12:20 PM | Comments (4) | TrackBack
February 17, 2006
The Level 60 Game
Anecdotally and from our own experiences, the game at level 60 is entirely different from the game pre-60. For one thing, level advancement is no longer the goal and most guilds become raid and instance oriented. We wanted to get a sense of this shift with numbers. And also, we wanted to see whether this is a gradual shift starting at level 40 or level 50, or whether this is indeed a drastic shift that occurs at level 60.
We decided to look at this via social network metrics. How different are characters in guilds at different levels? For this, we calculated the social network metrics (density, centrality, and combined connection times) for each character and found their means according to their level range.
The data suggest a sudden shift at 60 rather than a gradual change. Here are the 3 graphs showing the difference for the 3 metrics mentioned.
Server Sample: RP (High), PvE (High), PvE (High), PvP (High), PvP (High)
Sampling Period: Month of January
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: Only those characters who are in a guild.
Sample Size: 179,003 characters
Posted by nickyee at 12:36 PM | Comments (8) | TrackBack
February 02, 2006
Centrality, Class and Gender
After the previous analysis, we ran an additional one that included the character gender variable. Here, our results were puzzling. Across all of our metrics, male characters were better connected than female characters. And this was true for all classes, with the only exception of Priests. In other words, male characters of all classes are better connected than female characters of all classes, except for female Priests, who are better connected than male Priests. This gender difference was clear and consistent across our three measures of centrality.*
We then ran several analyses to filter out possible explanations and help clarify what may be happening:
1) Are male characters typically in larger guilds than female characters? We ran a quick t-test. While significant, the difference was between 51 and 55, so it can't really account for the difference we're seeing.
2) Do male characters play more than female characters? Again, the t-test was significant, but the difference was insubstantial (1437 vs. 1481 minutes).
3) Do male characters group more than female characters? The t-test here comparing grouping ratios was not significant.
4) Are male characters higher level than female characters? The t-test was significant. The difference was between 33 and 35 - an insubstantial difference.
5) Are there more level 60 male characters than level 60 female characters? 25% of female characters were level 60. 28% of male characters were level 60. Again, an insubstantial difference.
So we're at a loss as to why we're seeing the pattern we're seeing. In every respect we can measure, male and female characters seem to be largely equivalent. Thus, we have two findings we're not sure how to explain. First of all, why are male characters better connected? And secondly, why are female Priests the exception? Any suggestions?
*Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: Month of November
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: None
Sample Size: 179,003 characters
Posted by nickyee at 12:35 PM | Comments (12) | TrackBack
January 13, 2006
Centrality and Class
Centrality is a measure of how well-connected an individual is in a social network. We created three measures of centrality in our exploration of whether a character's centrality in a guild was related to the character's class. In our analysis, we only included characters that were in a guild in the month of November.
Crude Degrees: The number of connections a character has.
Degree Centrality: Crude Degrees divided by the total number of possible connections - i.e., the guild size - 1.
Combined Weights: Connections between characters are actually the total time those two characters have spent together. The above two measures count any connection weight greater than zero as a connection. In this metric, we add up all the weights a character has. This represents the total time this character has spent with other members of their guild.
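To make the three definitions concrete, here is a minimal Python sketch that computes them for one guild. The data layout (a dictionary keyed by unordered pairs of names, holding the time each pair spent together) and the function name `centrality` are assumptions made for this example, and the numbers are invented.

```python
def centrality(members, weights):
    """Per-member (crude degrees, degree centrality, combined weights) for one guild.

    members: list of character names in the guild.
    weights: dict mapping frozenset({a, b}) -> total time a and b spent together.
    """
    scores = {}
    possible = len(members) - 1  # most connections any one member can have
    for m in members:
        ties = [weights.get(frozenset((m, other)), 0) for other in members if other != m]
        degrees = sum(1 for w in ties if w > 0)  # any nonzero weight counts once
        scores[m] = (degrees, degrees / possible if possible else 0.0, sum(ties))
    return scores

# Invented four-member guild; weights are minutes spent together.
guild = ["A", "B", "C", "D"]
minutes = {frozenset(("A", "B")): 30, frozenset(("A", "C")): 10}
scores = centrality(guild, minutes)
```

In this toy guild, A has two connections out of a possible three and 40 combined minutes, while D, who was never seen with a guild mate, scores zero on all three measures.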
We then plotted these scores by character class.* The results across all three metrics were almost identical. Priests were always best connected. Paladins, Rogues, and Hunters were always least connected. In terms of pure connections, Priests that are in a guild have on average 12 more connections than Paladins who are in guilds. Priests that are in guilds spend about double the time with guild mates that Paladins do.
While it may be tempting to explain all of this by class demand driven by game mechanics, what we can't tease out from this analysis of course is the personality differences involved in choosing a character class. After all, players who choose to be Priests may simply be more gregarious than those who choose to be Paladins and that in and of itself may be accounting for a great deal of the variance.
So I went back and checked the numbers at The Daedalus Project. Players who prefer Priests and Paladins score high on Socializing, so it doesn't look like the desire to Socialize is driving this difference. The Teamwork score almost looks like a better fit. So there's some evidence that part of what we're seeing here is an expression of player motivations in addition to impacts of class demand.
*Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: Month of November
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: None
Sample Size: 179,003 characters
Posted by nickyee at 02:01 PM | Comments (8) | TrackBack
January 04, 2006
Predicting Guild Survival
In an earlier post, we looked at the percentage of guilds that are not seen again after a one-month period. About 21% of guilds are not seen after a period of a month. Since then, we've focused a lot on creating metrics with which to compare guilds, including metrics that take into account the social network of each guild. This allowed us to address a question that we couldn't address before - How well can we predict a guild's survival rate?
To perform this analysis, we took the first and last 10 days of August and extracted all unique guilds in both samples. We only included a guild if its guild size was greater than one. We then calculated the following metrics for each guild:
Size: The number of unique characters seen to bear this guild tag over the 10 days.
Max Subgraph: The number of members in the largest subgraph. See this post for more information.
Mass Count: The number of subgraphs in a guild larger than 3. In other words, dyads are not counted.
Total Count: The number of subgraphs in a guild, including dyads.
Density: Connections between guild members can be mapped out as a matrix. The density of a guild is the percentage of cells that are filled in in that matrix.
Centrality: For each guild member, their degree centrality is the number of connections they have divided by the total number of connections they can have (i.e., the guild size - 1). The guild's centrality is the average of all of its characters' centrality scores.
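Several of these metrics come down to finding connected subgraphs in a guild's connection network. The Python sketch below shows one way to compute them with a simple graph traversal. It is not the code behind the analysis: the function name `guild_metrics` is hypothetical, isolated members are assumed not to count as subgraphs, and the Mass Count threshold is exposed as a parameter (`mass_min`, defaulting to 3) because the definition above can be read more than one way.

```python
def guild_metrics(members, edges, mass_min=3):
    """Size, subgraph counts, and density for one guild.

    members: list of character names.
    edges: set of frozenset({a, b}) pairs that spent any time together.
    mass_min: smallest subgraph size counted toward mass_count (an assumption).
    """
    neighbors = {m: set() for m in members}
    for pair in edges:
        a, b = tuple(pair)
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, components = set(), []
    for m in members:
        if m in seen:
            continue
        stack, comp = [m], set()
        while stack:  # depth-first walk over everyone reachable from m
            cur = stack.pop()
            if cur in comp:
                continue
            comp.add(cur)
            stack.extend(neighbors[cur] - comp)
        seen |= comp
        components.append(comp)
    n = len(members)
    groups = [c for c in components if len(c) >= 2]  # dyads and larger
    return {
        "size": n,
        "max_subgraph": max((len(c) for c in components), default=0),
        "total_count": len(groups),
        "mass_count": sum(1 for c in groups if len(c) >= mass_min),
        # Filled cells over possible cells, counting each unordered pair once.
        "density": len(edges) / (n * (n - 1) / 2) if n > 1 else 0.0,
    }

# Invented six-member guild: a triad, a dyad, and one isolated member.
roster = ["A", "B", "C", "D", "E", "F"]
links = {frozenset(("A", "B")), frozenset(("B", "C")), frozenset(("D", "E"))}
metrics = guild_metrics(roster, links)
```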
We then looked at the data from the last 10 days of August. If a guild seen in early August was not observed in late August, we marked it as "dead". Otherwise, we marked it as "survived". Using this method, we had 4,259 unique guilds in our sample. Of those, 924 (or 22%) were not seen at the end of August and marked as "dead".
We then ran a logistic regression with survival as the dependent variable and the metrics mentioned as the predictor variables. To make a long story short, Size and Mass Count were the only two substantial predictors. Both of those predictors were positively correlated with guild survival. Including the other predictors did not help our model significantly. The r-square for the model using these two predictors was .12.
We then went ahead and calculated the survival likelihood of guilds in our sample. Using a strict cut-off, the model that was provided by the logistic regression was accurate in 73% of the "death" cases and 69% of the "survival" cases. The model provided results that were better than chance alone, but not as strong as we would have liked. On the other hand, the exercise showed that simple metrics of individual guilds can be used to predict their long-term survival.
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: First and last 10 days of August.
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique guild with a guild size greater than 1.
Data Filter: None
Sample Size: 4,259 guilds
Posted by nickyee at 05:43 PM | Comments (0) | TrackBack
December 26, 2005
Restarts
Since I first began parsing the census data, I've always noticed that some people made negative progress over the sampling period. In other words, their ending level was lower than their starting level. The only way to do this is to have restarted a character using the same name. I never thought much about this, and always assumed that these were separate players who simply used the same names, and thus I always filtered these characters out (always around 0.3%).
Recently, Nic brought this issue up at our weekly meeting and we finally got around to talking about it. And as we talked about what was driving this phenomenon, it became clear that it probably wasn't the case that we were looking at two different players. Nic and Eric had also noticed this when they first started looking at the data (before I arrived this summer) and they suggested that these were players who decided to recreate a character using that name because they liked that name. And then I realized that there is no reason why someone would delete their character given that even when players quit they might want to come back at a later date. So we all started to realize that the majority of these characters with negative progress are probably made by the same player rather than two different players.
Altogether, we found 567 restarts in our sample. That came out to 0.3% of all characters we parsed. Plotting out the frequencies of characters that have been restarted is quite interesting. There were quite a few high level characters that were restarted.
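A minimal sketch of the negative-progress check described above, assuming the observations arrive as time-ordered (name, level) pairs. The function name and the sample characters are hypothetical, and this is not our actual parser.

```python
def find_restarts(observations):
    """Return names whose last observed level is below their first observed level.

    observations: iterable of (character_name, level) in time order.
    """
    first, last = {}, {}
    for name, level in observations:
        first.setdefault(name, level)  # remember only the first sighting
        last[name] = level             # keep overwriting with the latest sighting
    return {name for name in first if last[name] < first[name]}


# Made-up observations: "Borg" drops from 40 to 3, so it counts as a restart.
obs = [("Aria", 12), ("Borg", 40), ("Aria", 13), ("Borg", 3), ("Aria", 13)]
restarted = find_restarts(obs)

# The share reported in the post: 567 restarts out of 207,298 characters.
restart_pct = round(567 / 207298 * 100, 1)
```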
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 10/14/2005 12:00 am - 10/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: See below
Sample Size: 207,298 characters
Posted by nickyee at 08:00 PM | Comments (8) | TrackBack
December 20, 2005
Character Gender and Server Type
As you can guess, it makes me really happy when survey data from The Daedalus Project matches census data here at PlayOn. The gender difference by server type is one of these matches. The survey data showed that players on PvP servers tend to be younger and more likely to be men. And because younger men tend to gender-bend less than older men (the most likely demographic to gender-bend), we would expect there to be fewer female avatars on PvP servers than on PvE servers. The census data showed this pattern.
The percentage of female avatars was highest on RP servers and lowest on PvP servers.
If we actually calculate the precise gender ratio for the servers, we get a clearer graph. The graph below plots out the number of female avatars per male avatar. On the RP server, it's about 1:2. On the PvP servers, it's about 1:3.5.
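To relate the two ways of reporting this (percentage of female avatars vs. female avatars per male avatar), here is a small conversion sketch. The function name is ours for illustration; the two ratios are the approximate ones quoted above.

```python
def female_share(females_per_male):
    """Convert a female-per-male ratio into the fraction of avatars that are female."""
    return females_per_male / (1 + females_per_male)


rp_share = female_share(1 / 2)     # about 1 female avatar per 2 male avatars
pvp_share = female_share(1 / 3.5)  # about 1 female avatar per 3.5 male avatars
```

So a 1:2 ratio corresponds to roughly one in three avatars being female, and 1:3.5 to roughly two in nine.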
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 10/14/2005 12:00 am - 10/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Collection Method: Please see our earlier post to get a sense for how we collected the data on gender.
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: See below
Sample Size: 90,961 characters with known gender
Posted by nickyee at 03:26 PM | Comments (0) | TrackBack
December 12, 2005
Gender, Race, and Class
It's a little eerie how those 3 words have their own meanings in an MMO, but yet when you put them all together, you realize how much weight they carry over from the physical world. And it makes you wonder whether it's really possible to talk about fantasy race without it somehow always implicating race in the physical world. (Ah - but that discussion is for another blog)
Here's what we found in our data. The gender ratio is different for the Alliance and the Horde. There are fewer female characters on the Horde side. One out of three characters is female on the Alliance side. On the Horde side, it is one out of five. Our intuition is that fewer players choose to be female on Horde side because the female Horde characters are kinda ... ugly.
A more fine-grained analysis shows the underlying difference. The top 3 races with the most female avatars are Alliance races - Night Elves, Humans, and Gnomes.
The Daedalus Project data suggests that male and female players are equally represented on both the Alliance and Horde. This implies the observed gender differences are driven almost entirely by gender-bending. Given that players who choose Horde are more likely to be competitive and achievement-oriented than players who choose Alliance, who tend to be more customization- and role-playing-oriented, this makes a great deal of sense. Of course, as many players point out, they gender-bend to have an attractive avatar to look at. Playing a female Horde character would defeat this purpose.
And we finish up with the gender differences by class.
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 10/14/2005 12:00 am - 10/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Collection Method: Please see our earlier post to get a sense for how we collected the data on gender.
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: See below
Sample Size: 90,961 characters with known gender
Posted by nickyee at 12:20 PM | Comments (5) | TrackBack
December 05, 2005
Gender Scraping (Overview)
For some time now, we've been experimenting with ways to track the gender of characters in addition to the other variables we've been tracking. Unfortunately, because of the way that Blizzard has constructed the programming interface accessible to the modding community, character gender is not as conveniently available as, say, the character's level. As a result, we can't reliably determine the gender of all characters we have in the census. In effect, the game only allows us to determine a character's gender if it's possible to target them.
So our strategy has been to move our collection characters to the faction capitals (Ironforge and Orgrimmar), and as the census scraper takes the census, we try to target each character seen in the census. If they happen to be near the auction house at this time, we record their gender. This method has two inherent biases. We're more likely to know a character's gender: 1) the more they play, and 2) the higher level they are. The following chart of character levels and whether we know their gender illustrates this bias.
As a character is played more, and becomes higher level, it becomes more and more likely that we'll have seen them at least once while we were collecting a census. On the servers we're watching, we know the genders of 44% of all characters, and the likelihood that we know their gender rises to about 80% by the time the character is level 50. And overall, the characters we know the genders of play about 3-4 times more than the characters we don't know the genders of.
Nevertheless, the results are provocative, and at the same time will confirm the sensibilities of any experienced WoW player. No real surprises, but it's still fascinating to see the results as hard, cold numbers.
And the results? Stay tuned...
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 10/14/2005 12:00 am - 10/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: See below
Sample Size: 207,298 characters
Posted by nickyee at 12:14 PM | Comments (0) | TrackBack
November 28, 2005
Who's Farming?
We just wanted to share some farming-related data that goes well with what many on WoW servers have a gut-feeling about. First, here's the overall class distribution of characters. Given some recent articles on the habits of gold farmers, we felt an easy way of identifying them would be to filter characters by their play-time over the period of one month.
Now, we include only those characters at or above the 99th percentile of play-time, i.e., the top 1% (n = 2413).
The trend is sharper if we take only characters at or above the 99.9th percentile of play-time, i.e., the top 0.1% (n = 245). Rogues and hunters together account for 85% of characters in that range.
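The filtering step can be sketched roughly as follows. This assumes each character is a (class, play-time) pair; the function name and the ten sample characters are made up and only illustrate the idea of keeping the heaviest players and then looking at their class mix.

```python
from collections import Counter


def top_share_class_mix(characters, top_fraction):
    """Class distribution among the top `top_fraction` of characters by play-time.

    characters: list of (class_name, play_minutes) tuples.
    top_fraction: e.g. 0.01 for the top 1%, 0.001 for the top 0.1%.
    """
    ranked = sorted(characters, key=lambda c: c[1], reverse=True)
    keep = max(1, round(len(ranked) * top_fraction))
    top = ranked[:keep]
    counts = Counter(cls for cls, _ in top)
    return {cls: n / keep for cls, n in counts.items()}


# Made-up sample of ten characters; the top 20% is the two heaviest players.
chars = [
    ("Rogue", 900), ("Hunter", 850), ("Mage", 300), ("Priest", 200),
    ("Warrior", 400), ("Rogue", 100), ("Druid", 50), ("Hunter", 120),
    ("Mage", 80), ("Paladin", 60),
]
mix = top_share_class_mix(chars, 0.2)
```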
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 8/01/2005 12:00 am - 8/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: None
Sample Size: 241,378 characters
Posted by nickyee at 03:29 PM | Comments (5) | TrackBack
November 21, 2005
Guild Members: Predictors of Average Character Advancement in Guilds
We were interested in exploring what measurable features of a guild best predict the level advancement of its members. How might factors like guild size or max subgraph size relate to overall advancement of its members?
From the month of August, we calculated the following metrics for the first 10 days of the month and the last 10 days of the month:
Guild Size: The number of unique characters observed to bear the guild tag.
Max Subgraph Size: The size of the largest connected cluster of guild members. See this earlier article for details.
Subgraph (size > 3) Count: The number of subgraphs that had at least 3 members. In other words, dyads are not counted.
Total Subgraph Count: The total number of subgraphs, including all dyads.
Density: For the matrix of guild members, the density is the number of cells filled out of all possible cells. In other words, how many of the guild's characters have been co-located during the sampling period?
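The metrics above can be sketched in a few lines, assuming the co-location data for a guild is a list of member pairs that were seen together. This is an illustration under stated assumptions, not our actual code: the function name, the `min_size` parameter (set to 3 here, following the "at least 3 members" wording), and the choice to count density over unordered pairs are all ours.

```python
from collections import defaultdict


def guild_metrics(members, links, min_size=3):
    """members: list of character names; links: list of (a, b) co-located pairs."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Walk each connected cluster once; members with no links form no subgraph.
    seen, sizes = set(), []
    for start in list(neighbors):
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        sizes.append(size)

    n = len(members)
    possible_pairs = n * (n - 1) // 2
    unique_links = len(set(map(frozenset, links)))
    return {
        "guild_size": n,
        "max_subgraph_size": max(sizes, default=0),
        "masscount": sum(1 for s in sizes if s >= min_size),
        "total_subgraph_count": len(sizes),
        "density": unique_links / possible_pairs if possible_pairs else 0.0,
    }


# Made-up guild of 7: one triad (a-b-c), one dyad (d-e), two unconnected members.
members = list("abcdefg")
links = [("a", "b"), ("b", "c"), ("d", "e")]
metrics = guild_metrics(members, links)
```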
For a measure of advancement, we used the following:
Standardized Character Advancement Score: A character's raw advancement is simply the number of levels the character has advanced. In this case, we subtracted the starting level from the ending level (end of month - beginning of month). The problem is that over a two week period, a 10 level advancement by a level 1 character is much less significant than a 10 level advancement by a level 50 character. In other words, the advancement needs to be qualified by the starting level somehow. The method we used to standardize character advancement was to calculate the average (and standard deviation) of advancement for every starting level. In other words, compared with other characters who also started at level 10, were you above, below, or right on the curve? Mathematically, we did this by calculating the z-score of advancement for every character. Characters who were already level 60 at the beginning of the sampling period were excluded. See this article for more details.
Standardized Guild Advancement Score: As a measure of guild performance and achievement, we averaged the standardized advancement scores of every member in that guild. This guild score was thus how much the guild as a whole advanced during the sampling period. Again, characters who were already level 60 at the beginning of the sampling period were excluded.
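Here is a minimal sketch of the two scores, assuming each character is a (name, guild, start level, end level) tuple. The names are hypothetical, and the use of the population standard deviation is an assumption; the post does not say which variant we used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardized_advancement(characters, level_cap=60):
    """z-score of each character's level gain among characters with the same start level.

    characters: list of (name, guild, start_level, end_level).
    Characters already at the level cap at the start are excluded.
    """
    eligible = [c for c in characters if c[2] < level_cap]
    gains_by_start = defaultdict(list)
    for _, _, start, end in eligible:
        gains_by_start[start].append(end - start)
    stats = {s: (mean(g), pstdev(g)) for s, g in gains_by_start.items()}

    scores = {}
    for name, _, start, end in eligible:
        mu, sigma = stats[start]
        # If everyone at this start level gained the same amount, treat them as average.
        scores[name] = (end - start - mu) / sigma if sigma else 0.0
    return scores


def guild_scores(characters, scores):
    """Average the standardized scores of each guild's eligible members."""
    by_guild = defaultdict(list)
    for name, guild, _, _ in characters:
        if name in scores:
            by_guild[guild].append(scores[name])
    return {g: mean(v) for g, v in by_guild.items()}


# Made-up characters: four start at level 10, one is already level 60.
chars = [
    ("A", "G1", 10, 14), ("B", "G2", 10, 16),
    ("C", "G1", 10, 12), ("D", "G2", 10, 18),
    ("E", "G1", 60, 60),
]
scores = standardized_advancement(chars)
guilds = guild_scores(chars, scores)
```

In this toy sample the level 10 characters gained 5 levels on average, so guild G2 (gains of 6 and 8) ends up with a positive score and G1 (gains of 4 and 2) with a negative one.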
A multiple regression showed that the guild metrics from the first 10 days of the month were better predictors than guild metrics from the last 10 days of the month (r-squared = .18 vs. r-squared = .08).
Using only the guild metrics from the first 10 days of the month, we find that the size of the guild is negatively correlated with guild advancement. The bigger a guild is, the slower the members level. Intriguingly, the best positive predictor of guild advancement was not max subgraph size or density but the number of subgraphs with size greater than 3 (referred to in table below as masscount). And the total number of subgraphs was a far weaker predictor. In other words, dyads don't help and it's really the number of subgraphs with 3 or more members in a guild that helps.
So the numbers are saying that how interconnected a guild is helps but it's got to be interconnected in the right way. The crucial thing seems to be having separate subgraphs that cater to different level bands within your guild that facilitate teaming and leveling throughout your guild. This seems like a reasonable explanation for what the masscount correlation is showing. Our analysis with guild metrics from different times of the month also shows how dynamic this influence is. The guild metrics at the beginning of the month were better predictors than those at the end of the month.
The bottom-line seems to be that large guilds do not facilitate character advancement unless they are well-connected and have clusters of guild members for different level ranges.
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 8/01/2005 12:00 am - 8/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: None
Sample Size: 241,378 characters; 3,335 guilds
Posted by nickyee at 10:24 PM | Comments (0) | TrackBack
November 13, 2005
Rate of Advancement by Server
The differences by server type were also quite interesting in that they cleanly show that characters on PvP servers level more than characters on PvE servers who in turn level more than characters on RP servers.
Now, characters on PvP servers actually also play more than characters on other server types, by about 2-3 hours each month.
But even if we took this into account and controlled the advancement score by the playing time, it still comes out the same way. Characters on PvP servers level more, spend more time playing, and level faster than characters on other servers. Notably, characters on RP servers spend almost as much time playing, yet they level the least and are the slowest levelers.
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 8/01/2005 12:00 am - 8/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: See article on Rate of Advancement Overview
Posted by nickyee at 03:22 PM | Comments (4) | TrackBack
Rate of Advancement by Race
The race differences were a little more interesting in that the top 4 races were the Horde races and the bottom 4 races were the Alliance races. The split was surprisingly clean. The split also perfectly matches data from the Daedalus Project on motivational differences between players who choose Horde vs Alliance. It's always good to see two different data methods supporting each other's results.
Again, there were differences in playing time. Notably, Night Elves play just as much as Undead, which is surprising given the advancement difference.
If we plotted out the average level advancement controlling for playing time, we see this difference more clearly. So the Undead level the most over a month, spend the most time playing, and are actually also the fastest levelers. Night Elves on the other hand, spend almost as much time playing, but are the slowest levelers of all the races.
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 8/01/2005 12:00 am - 8/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: See article on Rate of Advancement Overview
Posted by nickyee at 03:19 PM | Comments (3) | TrackBack
November 04, 2005
Rate of Advancement by Class
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 8/01/2005 12:00 am - 8/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: See article on Rate of Advancement Overview
Over the month of August, there were significant differences in how much characters of different classes leveled. The y-axis in the graph below shows the standardized scores. So, for example, let's take the .12 for the Rogues. We can refer to the table and pick a certain starting level, say level 30. On average, level 30 characters advanced 5.53 levels over the month. Rogues were .12 standard deviations higher. The standard deviation from the table is 5.11. So level 30 Rogues on average leveled 5.53 + 5.11 * 0.12 = 6.14 levels. On the other hand, level 30 Druids on average leveled 5.53 + 5.11 * -0.14 = 4.81 levels.
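The conversion from a standardized score back to levels is just the z-score formula run in reverse. A small sketch using the level 30 figures quoted above (the function name is ours):

```python
def expected_levels(mean_gain, sd_gain, z):
    """Convert a standardized score back into levels gained for one starting level."""
    return mean_gain + sd_gain * z


# Level 30 figures from the post: mean gain 5.53, standard deviation 5.11.
rogue_levels = round(expected_levels(5.53, 5.11, 0.12), 2)
druid_levels = round(expected_levels(5.53, 5.11, -0.14), 2)
```

The same function works for any other starting level once you plug in that level's mean and standard deviation from the table.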
But to a certain extent, this conflates level advancement with playing time. For example, Rogues actually also spend more time playing than most other classes.
If we controlled for playing time, we get a more precise sense of actual "rate" of leveling. The huge drop for the Rogue means that most Rogues play more than other characters, and that this is what leads to their higher level advancement, but once we take their higher playing time into account, they aren't the fastest levelers overall.
In summary, Rogues level the most over a period of a month but this is largely because they spend more time playing than other characters. The actual fastest levelers are Priests, but because they spend less time playing, their actual level gain is less than Rogues.
Posted by nickyee at 02:54 PM | Comments (8) | TrackBack
Rate of Advancement (Overview)
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 8/01/2005 12:00 am - 8/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: See below
Sample Size: 241,378 characters
We wanted to explore level advancement at a character level, to get a true sense of how much a character levels over a one month period. This also provides an approximation of the player's achievement motivation - how much they want to advance their character as quickly as possible. To do this, we looked at the first 10 days of the month and the last 10 days of the month and included only those characters that were observed in both periods. This was done so that we did not include new characters that started towards the end of the month - who presumably would have had less time to advance than those characters that were already there at the beginning of the month. This sampling method yielded 83,020 characters. We calculated a standardized measure of level advancement as follows.
Standardized Character Advancement Score: A character's raw advancement is simply the number of levels the character has advanced. In this case, we subtracted the starting level from the ending level (end of month - beginning of month). The problem is that over a one month period, a 10 level advancement by a level 1 character is much less significant than a 10 level advancement by a level 50 character. In other words, the advancement needs to be qualified by the starting level somehow. The method we used to standardize character advancement was to calculate the average (and standard deviation) of advancement for every starting level. In other words, compared with other characters who also started at level 10, were you above, below, or right on the curve? Mathematically, we did this by calculating the z-score of advancement for every character.
There were two large groups of characters that were excluded in this analysis. First, we excluded all characters who spent over 90% of their time in a city. We presumed that these were mules of one kind or another and they would simply introduce too much noise. 6,393 (or 7%) of the original sample were excluded this way. Then we excluded all characters who were already level 60 since by definition they couldn't advance anymore. This further excluded 14,408 (or 18.8%) of the remaining sample. Thus, we ended up with a sample of 62,035 characters. The means and standard deviations used to calculate the standardized scores were actually derived from this sample so we were making consistent comparisons.
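The two exclusion steps can be sketched as a pair of filters applied in order. The field names (`city_minutes`, `total_minutes`, `start_level`) and the sample characters are hypothetical; only the 90% city cut-off and the level 60 cap come from the description above.

```python
def filter_sample(characters, city_cutoff=0.90, level_cap=60):
    """Apply the two exclusions in order and return both intermediate samples.

    characters: list of dicts with 'start_level', 'city_minutes', 'total_minutes'
    (total_minutes is assumed to be greater than zero for any observed character).
    """
    # Step 1: drop presumed mules, i.e. characters spending over 90% of their time in a city.
    not_mules = [
        c for c in characters
        if c["city_minutes"] / c["total_minutes"] <= city_cutoff
    ]
    # Step 2: drop characters that were already at the level cap.
    can_advance = [c for c in not_mules if c["start_level"] < level_cap]
    return not_mules, can_advance


# Made-up characters: one mule, one level 60, two that stay in the sample.
chars = [
    {"name": "Mule", "start_level": 5, "city_minutes": 95, "total_minutes": 100},
    {"name": "Capped", "start_level": 60, "city_minutes": 10, "total_minutes": 100},
    {"name": "Leveler", "start_level": 30, "city_minutes": 20, "total_minutes": 100},
    {"name": "Newbie", "start_level": 1, "city_minutes": 0, "total_minutes": 50},
]
not_mules, final_sample = filter_sample(chars)
```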
Here is the plot of average level advancement over August by the starting level. We also have the full table of means and standard deviations below. We're using this article to set the stage for level advancement differences by server, class, and race.

Posted by nickyee at 02:49 PM | Comments (0) | TrackBack
October 31, 2005
Guilds: Densities
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 8/01/2005 12:00 am - 8/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique guild. See also entry on how we derived these social network measures.
Data Filter: None
Sample Size: 5569 guilds
As another measure of guild cohesiveness, we implemented a measure of density - that is to say, given the matrix of all characters in a guild, how many of those cells are filled. For example, a percentage of 25% (using the co-presence metric) means that on average over the month of August, each guild member has been online at the same time with 25% of other guild members at least once. But this also means that on average, each character was never online at the same time as 75% of other guild members over the period of the month.
We ran our analyses for both the co-presence and the co-location metric. Furthermore, we analyzed the data either: 1) with all observed connections, 2) excluding connections with a weight of one (observed only once over a month - approximately 12 minutes), and 3) excluding connections with a weight less than 3 (observed only once or twice over a month - approximately 24 minutes).
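To make the three filters concrete, here is a sketch that builds connection weights from census snapshots and then computes density at each threshold. It assumes a snapshot is simply the set of guild members seen together in one sample; the function names and the toy guild are ours, and the guild is assumed to have at least two members.

```python
from collections import Counter
from itertools import combinations


def pair_weights(snapshots):
    """Count, for every pair of members, the number of snapshots they share.

    snapshots: list of sets of guild members observed together in one sample.
    """
    weights = Counter()
    for present in snapshots:
        for pair in combinations(sorted(present), 2):
            weights[pair] += 1
    return weights


def density(weights, guild_size, min_weight=1):
    """Fraction of all possible member pairs with at least `min_weight` shared snapshots."""
    possible = guild_size * (guild_size - 1) // 2
    kept = sum(1 for w in weights.values() if w >= min_weight)
    return kept / possible


# Made-up guild of four members observed in five snapshots.
snapshots = [{"a", "b"}, {"a", "b", "c"}, {"a", "b"}, {"c", "d"}, {"c", "d"}]
weights = pair_weights(snapshots)

all_links = density(weights, 4)                # 1) all observed connections
over_one = density(weights, 4, min_weight=2)   # 2) excluding weight of one
over_two = density(weights, 4, min_weight=3)   # 3) excluding weight less than 3
```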
There are several weaknesses we'd like to mention up front. First, alts of the same player who are in the same guild can by definition never be co-present or co-located. So presumably, this has an effect on our results as the guild size increases when alts are more frequent. Secondly, characters who switch guilds are counted on the rosters of both guilds for the analysis. So this too may artificially deflate our results.
Still, the overall guild densities were lower than we would have expected. The result indicates that in an average guild in WoW (guilds have a mean size of 26 and a median size of 10), over a period of a month, every guild member is co-present with only 30% of his/her guild. In other words, the average guild member is never co-present with 70% of his/her guild members over the period of one month. But if we choose a more conservative co-presence filter (only counting those members who have been co-present for more than 20 mins over the period of a month), the percentage drops by one-third. The average guild member in WoW, over a period of one month, is never online for more than 20 minutes at the same time as 80% of other members of his/her guild.
As a proxy for collaboration, the co-location metric lets us get a sense of how often guild members work together. The analysis shows that over a period of a month, the average guild member collaborates with 11% of his/her guild members for at least 10 minutes. In the table below, "Co-Loc > 1" means counting those members who were co-located for more than 1 time sample (~12 mins).
Here are the two tables for the guild densities analyzed by guild size.
We also calculated whether guild densities correlate with guild size. The co-presence metric does correlate around .21 - .25, but this measure must correlate with guild size - after all, the more dice you roll, the more likely that two will be the same number. The correlation with the co-location metric is more interesting. It does not significantly correlate with guild size. In other words, guild members in large and small guilds are just as likely to work with other members. This raw measure doesn't give a sense of true frequency though. However, this is congruent with our finding that members of large guilds do not spend more time together than members in smaller guilds. Although, again, this may simply be confounded with increase in alts in larger guilds who by definition can never be co-present or co-located.
Posted by nickyee at 05:45 PM | Comments (0) | TrackBack
October 24, 2005
Guilds: Max Subgraph Size
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 8/01/2005 12:00 am - 8/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique guild. See also entry on how we derived these social network measures.
Data Filter: None
Sample Size: 5569 guilds
A subgraph in a social network is a self-contained group of individuals who are interconnected. For example, in the example social network below (using the co-location metric), there are 5 subgraphs - 4 of which are dyads and 1 subgraph containing 6 members. The max subgraph in a social network is the subgraph with the most members. Thus, the max subgraph size in the example is 6.
The max subgraph size of each guild gives a rough sense of cohesiveness. If most members of a guild often do group and work together, they would have a large max subgraph size. In the example shown above, the low max subgraph size relative to the guild size (6 vs. 37) implies that the guild is fairly fragmented and not cohesive.
Unsurprisingly, larger guilds have larger max subgraph sizes. The correlation is r = .63.
But the ratio of max subgraph size to guild size is only weakly correlated with guild size, r = -.08. This is partly due to the presence of alts, who by definition cannot be co-located with the mains.
Plotting this out allows us to see more clearly that there is a steady rise in max subgraph size that essentially starts fluctuating past a guild size of 50. We could almost argue that maximum guild cohesiveness occurs at a guild size of around 50. After that point, it starts getting hard for a guild to remain cohesive.
Plotting this out by subgraph ratio (max subgraph size / guild size) shows a more dramatic trend. The subgraph ratio peaks at about .50 around a guild size of 10 and drops steadily to below .10. The plot also shows that the correlation between guild size and subgraph ratio is strong only after a guild size of 10. Re-running the correlation after excluding guilds with fewer than 10 members, we find a correlation of -.36. Again, the larger the guild, the harder it is to remain cohesive.
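The re-run correlation can be sketched as follows: compute the subgraph ratio per guild, drop the small guilds, and correlate the ratio with guild size. We assume a plain Pearson correlation here; the function names and the five sample guilds are made up and will not reproduce the -.36 figure.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def ratio_correlation(guilds, min_size=10):
    """Correlate guild size with subgraph ratio, excluding guilds below min_size.

    guilds: list of (guild_size, max_subgraph_size).
    """
    kept = [(size, biggest / size) for size, biggest in guilds if size >= min_size]
    sizes, ratios = zip(*kept)
    return pearson(sizes, ratios)


# Made-up guilds; the 4-member guild is dropped by the size filter.
guilds = [(4, 2), (10, 5), (20, 8), (50, 12), (100, 10)]
r = ratio_correlation(guilds)
```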
Posted by nickyee at 01:29 PM | Comments (4) | TrackBack
October 18, 2005
Guild Members: Time Spent Together
Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: 8/01/2005 12:00 am - 8/30/2005 12:00 am
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique character. Each character was tracked across the server logs. Total playing time, lowest observed level, highest observed level, guild affiliation, and zones seen in were parsed.
Data Filter: None
Sample Size: 241,378 characters
From the social network analyses, we had connection frequencies between any two members in all observed guilds. This data allowed us to explore the average amount of time any two members from the same guild are observed together (co-location metric) over a period of a month. In other words, how much time do guild members actually spend working with each other?
We looked at the data in two ways. In our first pass, we looked at the data for every member of the same guild. In other words, the question we were asking was - over a period of a month, what's the average amount of time any two members of the same guild spend together? We also analyzed this data for guilds of different sizes. Interestingly, the result was largely constant across guilds with more than 5 members - with a median between 6 and 9 minutes over a period of a month. The amount of time members in a guild spend together doesn't appear to change as a function of guild size (r = -.02).
We then redid the analysis but only included those dyads in each guild that did spend time together. In other words, the question we were asking was - over a period of a month, for those members of the same guild that spend time together, what is the average amount of time they spend together? Again, we see the same pattern: the result was largely consistent across different guild sizes, with the median hovering around 80-87 minutes. The correlation between guild size and time spent together was again very weak (r = .04).
While the results seem low, remember that this is the average for all possible dyads within a guild. It's the average amount of time any 2 members of the same guild will spend together in a month. In a guild with 5 people, member A is thus expected to spend about 15 minutes each month with each of the other 4 members.
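The difference between the two passes is easiest to see in a sketch. We assume each co-located sample stands for roughly 12 minutes (our sampling resolution), so time together is the connection weight times 12. The function name and the toy guild are hypothetical.

```python
from statistics import mean, median

SAMPLE_MINUTES = 12  # approximate sampling resolution


def minutes_together(weights, guild_size):
    """Summarize time together over all possible dyads and over connected dyads only.

    weights: dict mapping a member pair to the number of samples they were co-located.
    """
    possible = guild_size * (guild_size - 1) // 2
    connected = [w * SAMPLE_MINUTES for w in weights.values() if w > 0]
    # Dyads that were never seen together contribute zero minutes.
    all_dyads = connected + [0] * (possible - len(connected))
    return {
        "all_mean": mean(all_dyads),
        "all_median": median(all_dyads),
        "connected_mean": mean(connected),
        "connected_median": median(connected),
    }


# Made-up guild of 5 (10 possible dyads), only 3 of which were ever co-located.
weights = {("a", "b"): 5, ("a", "c"): 1, ("d", "e"): 3}
summary = minutes_together(weights, 5)
```

In this toy guild the all-dyads average is pulled down by the seven pairs that never met, which is the same effect that makes the first-pass numbers look low.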